OK, not the most original title. But quite apt as I really don't intend to blog very much or very seriously.
See the archive of all posts for what little is there.
If you want to comment on an entry, please do so by email. I will post updates if you send relevant stuff my way.
debmirror 2.3 should be hitting the mirrors about now. Main change is that
it will now use the available diffs to update Contents files, which should
give a nice bandwidth reduction for users who mirror those.
With that the option --pdiff (for "package diff") no longer really covered
its function, so I decided to change it to --diff.
There's also a fix for mirroring archives that don't have a Release file.
Question for users
The option --add-dir has been marked as deprecated (for quite some time now I
suspect). I'm considering removing it in the next release as I cannot see any
use cases for it, but it's quite possible I'm missing something and there are
still people using it. If you would like that option preserved, then please
mail me at debmirror@packages.d.o with an explanation of why and how you
use it.
Managing the size of a local mirror
The archive has grown a lot over the past Debian releases and keeping even a
partial local mirror can require quite some disk space. Luckily debmirror
offers quite a few options to tune what is mirrored.
My own mirror covers testing and unstable 'main' for 6 architectures (i386, amd64, armel, hppa, sparc and s390), no source, no D-I images. It uses only 61GB. I say "only" as that's about 33GB less than it could have been without tuning. In other words, I'm saving a bit more than one third!
Here are the options I added to achieve this:
--exclude-deb-section='^debug$'
--exclude='/(xen-)?linux-[a-z]+-2\.6[.0-9]*-[-[:alnum:]]*(openvz|vserver|xen)[-[:alnum:]]*_'
--exclude='(k/kde|g/gnome|o/openoffice\.org).*/.*_(armel|hppa|s390)\.deb'
--exclude='(a/axiom/|d/debian-edu-doc/|e/ember(|-media)/|e/eclipse(/|-))'
--exclude='(e/erlang|g/(gcl(cvs)?|ghc6)/|l/llvm(/|-)|p/paraview/|o/openturns/)'
--exclude='(s/scalapack(-doc)?/|f/festvox-|g/gcc-snapshot/)'
--exclude='(/acl2-books_|/digikam-doc_|/fluid-soundfont-gm_|/deal.ii-doc_)'
--exclude='(/libxmpp4r-ruby-doc_|/lilypond-doc_|/qt4-doc_|/vtk-doc_)'
--exclude='/i18n/Translation-.*\.bz2' --include='/i18n/Translation-(nl|de)\.bz2'
And the explanation is:
- I rarely use debug packages and they are relatively big; if I do need one I'll download it manually from a remote mirror.
- I don't run vserver or xen kernels (and if I did I'd probably compile custom kernels anyway). I do want "regular" kernels because of D-I work.
- I doubt I'll ever want to install KDE, GNOME or OpenOffice on my armel, hppa or s390 boxes, but I do want them for the other three arches.
- Selected individual (mostly scientific) source packages that I doubt I'll ever use but use up significant disk space (and bandwidth when updated). These were found by a simple:
  du -s pool/main/*/* | sort -rn | head -n 50
- Selected individual huge binary packages (mostly documentation), found using:
  du -s pool/main/*/*/*.deb | sort -rn | head -n 50
- I'm only interested in Dutch and German translations of package descriptions. Well, actually I'm not even interested in those, but it's useful to have them for testing debmirror.
Obviously I have nothing against any of the packages that I exclude. It's just that I don't need them.
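To see how one of these patterns behaves, here is a minimal Python sketch that applies the KDE/GNOME/OpenOffice exclude to a few pool paths. The paths and version numbers are made up for illustration, and Python's `re.search` only approximates how debmirror (a Perl script) matches patterns against file paths.

```python
import re

# Exclude pattern copied from the options listed above.
EXCLUDE = re.compile(
    r'(k/kde|g/gnome|o/openoffice\.org).*/.*_(armel|hppa|s390)\.deb'
)

def is_excluded(path):
    """Return True if the exclude pattern matches anywhere in the path."""
    return EXCLUDE.search(path) is not None

# Hypothetical pool paths, for illustration only.
paths = [
    "pool/main/k/kdebase/kdebase_3.5.10_armel.deb",
    "pool/main/k/kdebase/kdebase_3.5.10_amd64.deb",
    "pool/main/g/gnome-panel/gnome-panel_2.26.3_s390.deb",
    "pool/main/d/dpkg/dpkg_1.15.4_hppa.deb",
]

# Only paths that do not match the exclude would be mirrored.
kept = [p for p in paths if not is_excluded(p)]
for p in kept:
    print(p)
```

The KDE and GNOME packages are dropped only for the three listed architectures; the same packages for other architectures, and unrelated packages for any architecture, pass through.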
Posted Sat Oct 3 18:09:08 2009
Yay! I've done it: 1160 lines of bash script are now 1215 lines of perl, and:
'debtree aptitude': 1m2.832s -> 0m0.596s
The new release is available as version 0.9.9 from the
debtree web site and has been
uploaded for the archive as version 1.0.
This was the starting position, the run time for my complete test set:
real 22m33.583s
user 18m29.709s
sys 4m21.320s
I began with a pure language conversion from bash to perl, i.e. I kept
the call-outs to dctrl-tools. This allowed me to easily identify problems
in the language conversion by running my test suite, without having to worry
that a change might have been caused by getting different data.
The language conversion itself was fairly straightforward; most time was
spent on finding all the little errors made during the conversion.
This resulted in "only" a 10% speedup:
real 20m56.368s
user 18m3.996s
sys 2m46.986s
So bash itself isn't even horribly slower than perl, even with all the
recursion and starting of subshells for calls to grep, sed, etc.
Then I replaced the call-outs to dctrl-tools one by one, adding the
dependency on libapt-pkg-perl. And that resulted in the amazing:
real 0m21.350s
user 0m19.797s
sys 0m1.372s
So, from 22 minutes to 21 seconds for 22 graphs, including some pretty complex ones. Not bad.
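For anyone who wants to redo the arithmetic, the following Python sketch converts the "real" timings quoted above into seconds and compares them. The function and variable names are just illustrative.

```python
import re

def to_seconds(stamp):
    """Convert a time(1)-style value such as '22m33.583s' to seconds."""
    minutes, seconds = re.fullmatch(r'(\d+)m([\d.]+)s', stamp).groups()
    return int(minutes) * 60 + float(seconds)

bash_real = to_seconds("22m33.583s")        # original bash script
perl_dctrl_real = to_seconds("20m56.368s")  # perl, still calling dctrl-tools
perl_apt_real = to_seconds("0m21.350s")     # perl using libapt-pkg-perl

# Fraction of wall-clock time saved by the language conversion alone.
first_step_saving = 1 - perl_dctrl_real / bash_real
# Overall speedup factor from start to finish.
overall_speedup = bash_real / perl_apt_real

print(f"language conversion alone: {first_step_saving:.1%} less wall-clock time")
print(f"overall: {overall_speedup:.0f}x faster")
```

On wall-clock time alone the first step comes out a little below the round 10% figure, while the complete conversion is a speedup of roughly a factor of 63.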
I had to keep a call-out to dctrl-tools for build dependencies as it
turned out libapt-pkg-perl
does not expose architecture conditions.
The full conversion process can be seen in the source repository, which was recently moved from my $HOME on alioth to collab-maint.
Posted Wed Sep 16 12:21:02 2009
debtree 0.8.0, including the
new option to
display reverse dependencies,
is now officially (or rather: unofficially) available.
The new feature is of course documented in the man page, but also on the website.
And now I think the time has come to port the script to perl. If I manage
that I plan to upload the package into the archive as version 1.0.
P.S. debtree now also supports generating trivial graphs:
$ debtree --max-depth=0 dpkg
Funnily enough that same graph is less trivial for apt. Support for
--max-depth=0 was added to allow generating graphs showing only
reverse dependencies.
I've just uploaded version 2.2 of debmirror, which introduces yet another
new feature: mirroring the i18n/Translation files that contain translations
of package descriptions. Many thanks to Joerg Jaspert for his quick response
to my request to include those files in the
Release file.
Joerg also implemented the change needed to use the diffs for Contents files
but that requires a fairly big code restructuring in debmirror.
The package has jumped from version 1.0 to 2.2 in just three weeks (closing 28 bug reports in the process), but I think the changes justify that. Here's an overview.
- Automatic creation and update of suite->codename symlinks (1.0)
  This also means it no longer makes any difference whether you tell debmirror to mirror sid or unstable.
- Option to cache the mirror state between runs (2.0)
  This significantly reduces the thrashing of the hard disk during mirror updates and cleanup, and improves the efficiency of individual runs.
  The disk thrashing has always been the main reason I did not want to do more than one update per day for my local mirror. Now it hardly matters how many runs I do: almost everything is done based on the cache data.
  To ensure the mirror stays consistent the cache has a (configurable) maximum lifetime after which a full check of the mirror will be done, if desired including an md5sum check of all files.
- Significant speed increase for parsing Packages/Sources files (2.0)
  For my mirror that stage now takes seconds rather than minutes. Additional speed increases should be possible in the stage that fetches the Packages and Sources files.
- Mirroring of "current" Debian Installer images (2.0)
  Which architectures and suites should be mirrored can be specified independently from the rest of the mirror.
- Mirroring additional files from specific directories (2.1)
  This allows mirroring of "trace files", of the contents of the ./doc and ./tools directories (which are needed if you want to create CD images using debian-cd), and of the ./indices directory.
  The transfer method used for this is always rsync, independent of the transfer method used for the rest of the archive. This is a restriction, but rsync is also the only usable option for files for which no real index or checksums are available.
- Mirroring translation files (2.2)
  As debmirror is primarily intended to be used for local, often partial, mirrors, it is of course possible to mirror only selected languages. Interested only in German and French translations? Simple, just use:
  --i18n --exclude='/Translation-.*\.bz2$' --include='/Translation-(de|fr).*\.bz2$'
  I've used '(de|fr).*' so that also country-specific variants (e.g. fr_FR) will be included.
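The interplay of the two translation options can be tried out with a small Python sketch. It assumes, as the options above imply, that a file matching --include is kept even when it also matches --exclude. The file paths are hypothetical, and Python's `re` module stands in for the Perl regular expressions debmirror itself uses.

```python
import re

# Patterns copied from the command line shown above.
EXCLUDE = re.compile(r'/Translation-.*\.bz2$')
INCLUDE = re.compile(r'/Translation-(de|fr).*\.bz2$')

def is_mirrored(path):
    """Keep a file unless it is excluded and not rescued by the include."""
    if INCLUDE.search(path):
        return True  # assumption: an include match overrides any exclude
    return EXCLUDE.search(path) is None

# Hypothetical translation files for a handful of language codes.
names = ["de", "fr", "fr_FR", "nl", "pt_BR"]
paths = [f"dists/sid/main/i18n/Translation-{n}.bz2" for n in names]

kept = [p for p in paths if is_mirrored(p)]
for p in kept:
    print(p)
```

With these inputs the German and French files survive, including the country-specific fr_FR variant, while the Dutch and Brazilian Portuguese files are excluded.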
If you're currently using the Lenny version of debmirror and would like to
use the new features: the package from unstable can be installed on Lenny
without any problems. The changes have been well tested, but I would advise
using --dry-run after the upgrade to check there are no unexpected problems.
One area where you may experience problems is when using debmirror for
other archives than the official Debian mirrors. If you do encounter issues
then please file a bug report.
Note that debmirror is not intended to be used for official mirrors.
There are different scripts
available for that from the Debian mirror team.
Funny how working on a program immediately inspires one to do more.
Remember that the initial motivation for debtree was to find out why a package was installed? It can now show that in the dependency graphs!
I'm not quite ready to do a new release, but the new version is available from the git repository.
Let's start with a simple example (all graphs are based on Lenny).
$ debtree -I --rdeps-depth=3 apt
Only installed packages are displayed here; if the -I option is omitted,
debtree will display all, but that does tend to explode the graphs,
especially for common libraries. As for forward dependencies, the color of
the arrows indicates Pre-Depends, Depends and Recommends.
The reverse dependencies are shown three levels deep (one is default).
The graph will always include all direct reverse dependencies (both on the
package itself and all virtual packages provided by it). For indirect reverse
dependencies there's a cut off that is set at five by default. An example is
debconf, which apparently has 9 reverse Pre-Depends and 58 reverse Depends
installed on my system.
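The idea of walking reverse dependencies to a limited depth can be sketched in a few lines of Python. This is only an illustration of the concept on made-up data: the package names other than apt are invented, the function name is hypothetical, and it says nothing about how debtree itself is implemented or how its cut off for indirect reverse dependencies works.

```python
from collections import deque

# Toy data: package -> packages that depend on it.
# Hypothetical names, not real Debian dependency data.
RDEPENDS = {
    "apt": ["frontend-a", "frontend-b"],
    "frontend-a": ["task-tool"],
    "task-tool": ["meta-package"],
    "meta-package": ["too-deep"],
}

def reverse_deps(package, max_depth):
    """Return {package: depth} for reverse dependencies up to max_depth levels."""
    seen = {}
    queue = deque([(package, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for rdep in RDEPENDS.get(current, []):
            if rdep not in seen:
                seen[rdep] = depth + 1
                queue.append((rdep, depth + 1))
    return seen

print(reverse_deps("apt", 1))  # direct reverse dependencies only
print(reverse_deps("apt", 3))  # three levels deep
```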
The next one is simply beautiful.
$ debtree -I --rdeps-depth=20 --no-conflicts libcairo2
Because of the --rdeps-depth=20 this shows the full recursion! I was
surprised that this graph remained a reasonable size. Apparently no packages
depend on the virtual package libcairo, at least none that I have installed.
The final one is extreme, and I must confess that I have cheated a bit by
suppressing the least interesting reverse depends (which explains why it does not
match the numbers from the apt graph).
$ debtree -I -R --no-recommends --no-conflicts debconf
The most interesting thing here is how it shows the debconf-2.0 transition.
Most packages depend on 'debconf|debconf2.0'; tex-common instead has
'debconf|cdebconf', while tasksel and exim4 have both combinations
(probably one explicitly in debian/control and the other added by debhelper).
ucf is missing the alternative; apparently it does not use debhelper (no
prizes for guessing who the maintainer is :-).
Notice anything about iamerican and ibritish? Yes, they really have a
double dependency on 'debconf|debconf2.0'.
The one thing missing is the version info for versioned dependencies. Not sure yet if I want to add that for reverse dependencies.
P.S. SVG versions of the images are available in the same directory as the JPGs.
Posted Tue Sep 8 14:38:19 2009
It's been quite some time (almost two years) since my previous "release"
of debtree, but now version 0.7.3 is
available.
And it still generates very nice graphs.
The changes are relatively minor: a few nice fixes for corner cases that
were not handled correctly, and an update of the default lists of "skip"
and "end" packages which help to limit the size of graphs for a fair number
of packages I tried (including konqueror and openoffice.org).
Reason to revisit debtree was a recent nice mail from a debtree user, but
also the current discussions about udev and the
FHS. I'm on
the side of "let's please keep /usr mountable separately". Mostly because
I like a (small) encrypted root with a separate (large) unencrypted /usr.
I'm also increasingly unhappy with the default size of Debian's desktop
installs, especially now that it looks as if Squeeze will see installation
of Recommends by default by tasksel (and thus Debian Installer).
For comparison, the size of a default Gnome desktop install for Etch was 1360MB; for Lenny it is 1830MB; for Squeeze it looks like it will be well over 3000MB! Remember that for Sarge we installed both Gnome and KDE from CD1 with both together taking 1390MB?
Sure, some of that is real functionality, but a lot is also (IMO) redundant
visual effects that only serve to slow the desktop down and junk needed to
do stuff automagically. And a heck of a lot is duplicated functionality.
One of the main reasons I switched to Linux was because it gave me back control
over my systems, but with KDE4 and pervasive stuff like hal and all the
various "kits" Linux is on a fast track that's giving priority to flashiness
over real functionality and eroding that control.
Here's a fairly default dependency graph for hal (click for full image).
Looks reasonable, right?
But that's only because most major dependencies, such as dbus, policykit
and pm-utils have been pruned. Here's a complete graph, with only libc6
omitted (full image is 1.5MB). Truly a tangled web. Scary.
One can also look at it from the other side. Today I upgraded my sid chroot
and found I suddenly needed to install libavahi-client3, libavahi-common3,
libavahi-common-data and libdbus-1-3. Why? Reason turned out to be
libcups2, so I checked if I really needed that. And here's why I do.
Most of these dependencies of libgtk2.0-0 I can understand, but isn't gtk
supposed to be a graphical toolkit library? Couldn't printing support be
implemented in some more specialized Gnome printing toolkit library?
But I'm probably missing something.
Anyway, now that I have a bit more perl experience through my
recent work on
debmirror, maybe I should finally
port debtree from shell script to perl...
See the debtree home page for a
full overview of how to read the graphs, but here's a quick intro.
Purple arrows are Pre-Depends, blue are Depends and black are Recommends;
green connections show Provides. The green packages are currently installed
in my sid chroot, while the white ones are not. The diamonds show where the
graph has been pruned: dependencies for these packages are not shown.
(Yeah, I've already had one post about debmirror, so one could argue this should be II, but a II without a I is also strange.)
In my second upload I've changed the package versioning from date-based to a more regular major.minor.bugfix, so, after a third bugfix upload we're now at 1.0.1. And with a nice set of changes too.
A nice bonus is a new script that gives a quite detailed overview of our archive size, at least a lot more detailed than this. It currently requires manual formatting, but if there is interest that could be coded and the script could be run from cron, on merkel for example.
Main changes in version 1.0
- Automatically create and update suite->codename symlinks based on info in the Release file. Directories for dists will always have the codename of the release. Conversion of existing mirrors that use suites for directories is supported.
- No longer keep uncompressed Packages files on the local mirror, similar to the official mirrors.
- Don't fetch (the huge) Contents files if they're unchanged. This is a significant improvement, but hopefully debmirror can soon support the diffs #436027.
Work in progress
I've also started work on a few new features.
Caching the state of the mirror
Debmirror has two places where it's quite slow and where it thrashes the hard disk a lot:
1) when it checks md5sums for all files listed in Packages and Sources files;
2) when it cleans up obsolete files.
This wishlist BR (#483922) has the solution:
to cache the state of the archive. After all, other than for meta data, 1) is
not really needed as nothing is normally going to change packages and source
files that have been downloaded. And 2) can be done much more efficiently if
you already know what you need to do than when you have to run find over
the whole archive.
After thinking about it a bit, the implementation turned out to be quite easy, and I now have a version that I'm ready to try on my own local mirror. After that I'll just need to add a few bells and whistles, so expect this in version 1.1.
Activating the cache will be through --state-cache-days=<N>. It seems
wise to periodically do a full check of the mirror (the current mode of
operation). The <N> does just that. Whether to do it every 7 or 28 or
350 days is up to the user (I would suggest 7 or 14).
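The decision behind such an option is simple enough to sketch. The Python below is a hypothetical illustration of "use the cache until it is N days old, then do a full check". It is not debmirror's code (debmirror is written in perl), and the function and parameter names are invented.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60

def needs_full_check(cache_created, state_cache_days, now=None):
    """Return True when the cached state has reached its maximum age.

    cache_created and now are Unix timestamps in seconds;
    state_cache_days plays the role of <N> in --state-cache-days=<N>.
    """
    if now is None:
        now = time.time()
    age_days = (now - cache_created) / SECONDS_PER_DAY
    return age_days >= state_cache_days

# Example: a cache written three days ago with a seven-day limit is still trusted.
three_days_ago = time.time() - 3 * SECONDS_PER_DAY
print(needs_full_check(three_days_ago, 7))
```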
Mirroring Debian Installer images
This is a very old wishlist item #154966, but actually quite straightforward to implement as D-I does include index files with md5sums with its images. So I'll give that a shot soon.
Mirroring the tools/ and doc/ directories will be harder as they currently
lack index files with md5sums.
Various
I also have a couple of branches that need further thought and work.
- Improved accounting of download size. Quite complex.
- More generic support for subsections (such as main/debian-installer).
Errors in dmesg annoy me and some of my contributions to the Linux kernel are powered by the desire to get rid of them. Not that error messages in themselves are so bad, but there is always the chance that they indicate a real or some deeper problem.
But fixing them is less easy when it comes to ACPI errors.
On my HP 2510p notebook I've been seeing this:
ACPI Exception: AE_AML_PACKAGE_LIMIT, \
Index (000000005) is beyond end of object 20090521 exoparg2-445
ACPI Error (psparse-0537): Method parse/execution failed \
[\_SB_.C2C3] (Node ffff88007e01eea0), AE_AML_PACKAGE_LIMIT
ACPI Error (psparse-0537): Method parse/execution failed \
[\_SB_.C003.C0F6.C3F3._STM] (Node ffff88007e044de0), AE_AML_PACKAGE_LIMIT
I'm finally starting to be able to read disassembled ACPI code a bit, and that allowed me to finally trace this to a bug in the SSDT1 table. AFAICT it is harmless, as the value on which it fails is never used, but still.
So the problem is that the SSDT1 table defines two "packages" (which I understand to be similar to arrays; index starts at 0):
C3F4 { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 } /* six elements */
C3F5 { 0x00, 0x00, 0x00, 0x00, 0x00 } /* five elements */
Then in method \_SB_.C003.C0F6.C3F3._STM there are two calls to another
method C2C3, using four arguments (Arg0 to Arg3):
\_SB.C2C3 (Local2, Local3, Local1, C3F4)
\_SB.C2C3 (Local2, Local3, Local1, C3F5)
And that method C2C3, which is defined in the DSDT table, does the
following in two branches:
Store (Local2, Index (Arg3, 0x05))
Store (0x00, Index (Arg3, 0x05))
The first line tries to store value Local2 in the sixth element of the
package passed as Arg3, i.e. either C3F4 or C3F5.
In the case where Arg3 is C3F4, this works fine, but in the case where
Arg3 is C3F5 it fails because that package only has five elements.
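The same off-by-one can be reproduced with ordinary lists. The Python sketch below only mirrors the logic described above, using lists as a stand-in for ACPI packages; it is an analogy, not a model of the AML interpreter, and the stored value is arbitrary.

```python
def store_at_index_5(package, value):
    """Mimic 'Store (value, Index (Arg3, 0x05))' on a Python list."""
    package[5] = value  # raises IndexError if there are fewer than six elements

c3f4 = [0x00] * 6  # six elements, valid indices 0..5
c3f5 = [0x00] * 5  # five elements, valid indices 0..4

store_at_index_5(c3f4, 0x2A)  # fine: index 5 exists

try:
    store_at_index_5(c3f5, 0x2A)
    outcome = "stored"
except IndexError:
    # The analogue of AE_AML_PACKAGE_LIMIT: index 5 is beyond the end.
    outcome = "index beyond end of package"

print(outcome)
```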
A few weeks ago, when it was fairly hot outside, my notebook suddenly decided to shut down while compiling a new kernel. Problem was that the processor kept going at full speed right until the end, something that's not supposed to happen. Thermal monitoring by ACPI and the kernel is supposed to throttle the system before it gets to critical temperatures.
But that only works if the system is designed correctly. Now I must admit that the fan was already failing (it has in the meantime been replaced by HP, under warranty) so that may have contributed, but it was still working sufficiently.
What also contributed is that I have the notebook in a docking station. This is great, but its design is such that it basically blocks most of the air flow from the fan...
After an extensive investigation with the excellent help of kernel developer Zhang Rui, the cause was found to be in the thermal zones defined in the notebook.
There are 6 thermal zones. Below is some info from /proc/acpi/thermal/.
TZ0/temperature:temperature: 60 C
TZ0/trip_points:critical (S5): 256 C
TZ0/trip_points:passive: 99 C: tc1=1 tc2=2 tsp=300 devices=CPU0 CPU1
TZ1/temperature:temperature: 60 C
TZ1/trip_points:critical (S5): 110 C
TZ3/trip_points:critical (S5): 105 C
TZ3/trip_points:passive: 95 C: tc1=1 tc2=2 tsp=300 devices=CPU0 CPU1
TZ4/trip_points:critical (S5): 110 C
TZ4/trip_points:passive: 60 C: tc1=1 tc2=2 tsp=300 devices=CPU0 CPU1
TZ5/temperature:temperature: 50 C
TZ5/trip_points:critical (S5): 110 C
TZ6/temperature:temperature: 25 C
TZ6/trip_points:critical (S5): 70 C
TZ6/trip_points:passive: 60 C: tc1=1 tc2=2 tsp=300 devices=CPU0 CPU1
The key here is the "passive" and "critical" trip points. At the first, the processor will be throttled. At the second the system will enter into an emergency shutdown. TZ0 is the zone monitoring the temperature of the processor itself; I'm not sure what exactly the other zones correspond to.
You may have noticed that zones TZ1 and TZ5 do not have a passive trip point. And that was exactly the problem. Some testing showed that those two zones can get quite high temperatures while the other zones are still OK. So the thermal protection never triggers and the system gets shut down when the critical limit for zone TZ1 or TZ5 is reached.
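Spotting zones without a passive trip point is easy to automate. Here is a small Python sketch that scans the listing shown earlier (pasted in as a string) and reports the zones lacking one. The function name is made up, and the parsing assumes exactly the "zone/file:content" layout of that listing.

```python
LISTING = """\
TZ0/temperature:temperature: 60 C
TZ0/trip_points:critical (S5): 256 C
TZ0/trip_points:passive: 99 C: tc1=1 tc2=2 tsp=300 devices=CPU0 CPU1
TZ1/temperature:temperature: 60 C
TZ1/trip_points:critical (S5): 110 C
TZ3/trip_points:critical (S5): 105 C
TZ3/trip_points:passive: 95 C: tc1=1 tc2=2 tsp=300 devices=CPU0 CPU1
TZ4/trip_points:critical (S5): 110 C
TZ4/trip_points:passive: 60 C: tc1=1 tc2=2 tsp=300 devices=CPU0 CPU1
TZ5/temperature:temperature: 50 C
TZ5/trip_points:critical (S5): 110 C
TZ6/temperature:temperature: 25 C
TZ6/trip_points:critical (S5): 70 C
TZ6/trip_points:passive: 60 C: tc1=1 tc2=2 tsp=300 devices=CPU0 CPU1
"""

def zones_without_passive(listing):
    """Return the sorted names of zones that have no passive trip point."""
    has_passive = {}
    for line in listing.splitlines():
        if not line.strip():
            continue
        zone, rest = line.split("/", 1)
        has_passive.setdefault(zone, False)
        if rest.startswith("trip_points:passive"):
            has_passive[zone] = True
    return sorted(zone for zone, found in has_passive.items() if not found)

print(zones_without_passive(LISTING))
```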
But as the system is running open source software there's a solution. A kernel patch makes it possible to load a custom DSDT ACPI table from the initramfs (initrd). So after diving into the DSDT code I came up with three small modifications, and I now have an extra passive trip point for TZ1:
TZ1/temperature:temperature: 60 C
TZ1/trip_points:critical (S5): 110 C
TZ1/trip_points:passive: 95 C: tc1=1 tc2=2 tsp=300 devices=CPU0 CPU1
Now that's one of the main reasons I still enjoy working with Linux so much. Maybe I should also fix that crazy critical temperature for TZ0...
Update
Via IRC Matthew Garrett offered the following alternative:
If there's no passive zone defined in the DSDT then there should be a
/sys/class/thermal/thermal_zoneN/passive that you can write a temperature into. So you can do this without a custom DSDT.
I gave it a quick try for TZ5, but the result was that the system became very sluggish and any task suddenly used much more CPU than normal. In principle it seems like a very nice alternative, but it looks as if there's a bug to be fixed first.
Update 2
The performance problem was due to a very simple mistake. I entered the
temperature in degrees Celsius instead of in millidegrees Celsius, which
immediately caused the system to be throttled down to unusability. Well,
at least that proved the mechanism works.
So now I have the following simple lines in /etc/sysfs.conf:
class/thermal/thermal_zone2/passive = 95000
class/thermal/thermal_zone5/passive = 95000
(thermal_zone2 matches TZ1 because of some weird ordering in ACPI.)
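To avoid repeating the unit mistake, the conversion can be left to a tiny helper. This Python sketch builds sysfs.conf lines in the format shown above from a temperature in degrees Celsius; the function name is hypothetical and the zone numbers are simply the ones used on this notebook.

```python
def sysfs_passive_line(zone_index, celsius):
    """Build a sysfs.conf line; the value is written in millidegrees Celsius."""
    millidegrees = int(round(celsius * 1000))
    return f"class/thermal/thermal_zone{zone_index}/passive = {millidegrees}"

# Passive trip points of 95 C for the two zones that lacked one.
for zone in (2, 5):
    print(sysfs_passive_line(zone, 95))
```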
Posted Thu Aug 20 14:44:07 2009
A few weeks ago I volunteered to adopt debmirror. It's a package I've been using for a long time myself and I expect maintaining it will give my (currently very basic) perl skills a boost.
The package is not in bad shape. It mostly just does what it needs to do, but there are a few interesting (wishlist) bug reports.
I've already done one upload (just migrated to testing) with mostly minor changes, mainly modernizing the packaging (using the magic 'dh' command). I'm now preparing another upload with the results of initial bug triaging and some general improvements.
As there already was an alioth project for debmirror, I've started using that. My changes can be seen in subversion, although I have some work-in-progress that I keep in a local git-svn checkout.
One wishlist bug concerned adding support to download the translations of package descriptions. Problem there is that debmirror currently only really supports mirroring files that are listed in the Release file (either directly or indirectly) and the i18n files are currently not listed.
But that may soon be fixed if the FTP-masters accept my patch to dak which adds an Index file for the translation files and lists that file in Release. After that supporting the package translation files in debmirror should be fairly straightforward.
So with that I've doubled the number of packages I maintain: from 1 to 2.





