debmirror I

(Yeah, I've already had one post about debmirror, so one could argue this should be II, but a II without a I is also strange.)

In my second upload I changed the package versioning from date-based to a more conventional major.minor.bugfix scheme, so after a third, bugfix-only upload we're now at 1.0.1. And with a nice set of changes too.

A nice bonus is a new script that gives a quite detailed overview of our archive size, at least a lot more detailed than this. Its output currently has to be formatted by hand, but if there is interest that could be coded and the script could be run from cron, on merkel for example.

Main changes in version 1.0

Work in progress

I've also started work on a few new features.

Caching the state of the mirror

Debmirror has two places where it's quite slow and where it thrashes the hard disk a lot:

  1. when it checks md5sums for all files listed in Packages and Sources files;
  2. when it cleans up obsolete files.

This wishlist bug report (#483922) has the solution: cache the state of the archive. After all, other than for metadata, 1) is not really needed, as nothing is normally going to change package and source files once they have been downloaded. And 2) can be done much more efficiently if you already know what needs to be removed than when you have to run find over the whole archive.
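To make the idea concrete, here is a minimal sketch in Python. It is not debmirror's actual code; the function names and the shape of the cache (a mapping from file path to the md5sum it was last verified against) are assumptions made for illustration only.

```python
def files_to_check(wanted, cached):
    """Return the files that still need an md5sum check.

    wanted: {path: md5} as listed in the Packages and Sources files.
    cached: {path: md5} for files verified during an earlier run
            (hypothetical cache layout).
    A file is skipped only if the cache holds the same md5sum for it.
    """
    return {path: md5 for path, md5 in wanted.items()
            if cached.get(path) != md5}


def obsolete_files(wanted, cached):
    """Return cached files that no index lists any more.

    This is a plain set difference, so no find over the archive is needed.
    """
    return sorted(set(cached) - set(wanted))
```

With a cache in place, case 1) shrinks to new or changed files, and case 2) becomes a comparison of two in-memory lists.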

After thinking about it a bit, the implementation turned out to be quite easy, and I now have a version that I'm ready to try on my own local mirror. After that I'll just need to add a few bells and whistles, so expect this in version 1.1.

The cache will be activated with --state-cache-days=<N>. It seems wise to periodically do a full check of the mirror (the current mode of operation), and <N> controls that: it is the number of days the cache is trusted before a full check is done again. Whether to do it every 7 or 28 or 350 days is up to the user (I would suggest 7 or 14).
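The expiry rule could look roughly like this. Again a sketch only: the function name and the idea of storing the time of the last full check as a Unix timestamp are assumptions, not something taken from debmirror.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


def needs_full_check(last_full_check, max_days, now=None):
    """Decide whether the cached state is too old to trust.

    last_full_check: Unix timestamp of the last full check, or None
                     if there is no cache yet (hypothetical field).
    max_days:        the <N> given to --state-cache-days.
    """
    if last_full_check is None:
        return True
    if now is None:
        now = time.time()
    return now - last_full_check > max_days * SECONDS_PER_DAY
```

With <N> set to 7, a run on day 6 after the last full check would use the cache, and a run on day 8 would fall back to checking everything.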

Mirroring Debian Installer images

This is a very old wishlist item (#154966), but actually quite straightforward to implement, as D-I does ship index files with md5sums alongside its images. So I'll give that a shot soon.
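As a rough illustration of why an index with md5sums is all that is needed, here is a Python sketch. It assumes the index uses the usual md5sum output format (digest, whitespace, relative path); the function names are made up for this example and are not part of debmirror.

```python
import hashlib
import os


def md5_of(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sums(text):
    """Parse md5sum-style lines into {relative path: digest}."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, name = line.split(None, 1)
        # md5sum marks binary mode with a leading '*' on the name
        entries[name.strip().lstrip("*")] = digest.lower()
    return entries


def files_to_fetch(entries, root):
    """List indexed files that are missing or differ under root."""
    stale = []
    for name, expected in sorted(entries.items()):
        path = os.path.join(root, name)
        if not os.path.isfile(path) or md5_of(path) != expected:
            stale.append(name)
    return stale
```

Anything returned by files_to_fetch would have to be downloaded again; everything else is already up to date.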

Mirroring the tools/ and doc/ directories will be harder as they currently lack index files with md5sums.

Various

I also have a couple of branches that need further thought and work.