On Sat, Nov 20, 2010 at 09:46:56AM +0900, Charles Plessy wrote:
> On Fri, Nov 19, 2010 at 11:06:51AM +0100, Holger Levsen wrote:
> > CDs won't grow in size and I'm pretty sure
> > the desktop install sizes won't shrink.
>
> There is actually room to spare a couple of megabytes by choosing a more
> efficient compression algorithm for the data tar archive in binary
> packages like gimp, for instance.
Oh wow, going from .gz to .xz cuts the size in half, and not only for
gimp-data but also for a typical ELF file.  And unlike with bzip2, the cost
is paid only during compression; decompression is hardly slowed down at
all.  Compression is significantly slower, but even on the slowest
architectures it is a tiny fraction of the time needed to build.  On my
cell phone, compressing large files takes ~13 sec/MB and decompressing
them 0.3 sec/MB (gzip: 0.18 sec/MB).

You can recompress arm packages on that 8-core amd64 machine nearby, too:

    # extract the gzipped data member, recompress it, swap it back in
    ar x $PKG data.tar.gz &&
    gzip -cdf data.tar.gz | xz > data.tar.xz &&
    ar d $PKG data.tar.gz &&
    ar r $PKG data.tar.xz

Let's grossly overestimate and say that, even including large data-only
packages, the overall build time would increase by 1%.  What would the
gains be?  HALVING[1] mirror usage, both disk and bandwidth; halving the
time it takes to download a package; halving the number of CDs; enabling
desktop users (i.e., the least technical ones) to install from a single
CD; and no noticeable slowdown of dpkg runs.

Repacking the whole archive may be somewhat disruptive, but with gains
this massive, I'd say it's worth it even at this point in the release
cycle.

[1] Measured on gimp + gimp-data.  I'm repacking CD1-i386 as we speak to
get a larger sample, deliberately on a slow machine, to dismiss concerns
about certain systems not being able to handle this.
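For anyone who wants to check the size claim themselves, a minimal sketch;
the .deb filename below is only an example, and xz runs at its default
preset:

    # Extract a package's data member and compare compressed sizes.
    ar p gimp-data_2.6.10-1_all.deb data.tar.gz | gzip -cd > data.tar
    gzip -9c data.tar | wc -c   # bytes with gzip (roughly what ships today)
    xz -c data.tar | wc -c      # bytes with xz at default settings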
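The per-MB timing figures are easy to reproduce with simple wall-clock
runs, reusing the data.tar from the previous snippet:

    # Time one compression (paid once, at build time) and one
    # decompression (paid at every install); divide by the file size in MB.
    time xz -c data.tar > data.tar.xz
    time xz -cd data.tar.xz > /dev/null
    time gzip -cd data.tar.gz > /dev/null   # current gzip baseline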
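And a sketch of how the CD-wide measurement in [1] could be scripted;
/mnt/cd1 is a hypothetical mount point, and packages whose data member
isn't data.tar.gz are simply skipped:

    # Sum data.tar.gz sizes across all .debs under a tree, and what they
    # would shrink to as xz.  Pool paths contain no whitespace, so the
    # unquoted $(find ...) expansion is safe here.
    gz_total=0; xz_total=0
    for pkg in $(find /mnt/cd1/pool -name '*.deb'); do
        old=$(ar p "$pkg" data.tar.gz 2>/dev/null | wc -c)
        [ "$old" -gt 0 ] || continue   # skip non-gzip data members
        new=$(ar p "$pkg" data.tar.gz | gzip -cd | xz | wc -c)
        gz_total=$((gz_total + old)); xz_total=$((xz_total + new))
    done
    echo "total data.tar.gz: $gz_total bytes; as data.tar.xz: $xz_total bytes"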
-- 
1KB // Microsoft corollary to Hanlon's razor:
    // Never attribute to stupidity what can be
    // adequately explained by malice.