Hi,

At DebConf12, I talked about xz compression for Debian packages(*).
Now I'll talk about the next step: a suggestion to use xz, backed by the
results of some experiments.

*) http://penta.debconf.org/dc12_schedule/events/930.en.html

------------------------------------------------------------------------------
 test environment (armel)
------------------------------------------------------------------------------

I used a Netgear ReadyNAS Duo v2, an armel machine, for this test.
See http://www.netgear.com/home/products/storage/prosumer/rnd2000.aspx
It is based on Debian Squeeze, so the results should be the same on Debian :)

# uname -a
Linux nas-A0-96-88 2.6.31.8.duov2 #1 Mon May 14 18:35:20 HKT 2012 armv5tel GNU/Linux

# cat /proc/cpuinfo
Processor       : Feroceon 88FR131 rev 1 (v5l)
BogoMIPS        : 1599.07
Features        : swp half thumb fastmult edsp
CPU implementer : 0x56
CPU architecture: 5TE
CPU variant     : 0x2
CPU part        : 0x131
CPU revision    : 1

Hardware        : Feroceon-KW
Revision        : 0000
Serial          : 0000000000000000

# free
             total       used       free     shared    buffers     cached
Mem:        246820     139216     107604          0       2760      57460
-/+ buffers/cache:      78996     167824
Swap:       524268        360     523908

For the test I used the libreoffice-core package (about 35MB), which
currently uses bz2 for package compression, and openclipart-png (old
version, about 600MB).

------------------------------------------------------------------------------
 results1 (libreoffice-core)
------------------------------------------------------------------------------

Okay? Here we go...

# du -m *
1       control.tar.gz
35      data.tar.bz2
38      data.tar.gz
24      data.tar.xz
1       debian-binary
35      libreoffice-core_3.5.4-7_armel.deb

# time gzip -d data.tar.gz
real    0m7.253s
user    0m4.980s
sys     0m1.070s

# time bzip2 -dfk data.tar.bz2
real    0m45.256s
user    0m42.320s
sys     0m2.000s

# time xz -dfk data.tar.xz
real    0m11.443s
user    0m9.710s
sys     0m1.450s

                                              size    decomp-time
without compression                         : 141MB   -
Default compression (gzip -9)               :  38MB   7.3s
Package option (bzip2 -9)                   :  35MB   45.3s
xz
 (--arm --check=crc32 --lzma2=dict=64KiB)   :  24MB   11.4s
 (--arm --check=crc32 --lzma2=dict=1MiB)    :  22MB   11.0s
 (--arm --lzma2=dict=64KiB)                 :  24MB   12.5s
 (--arm --lzma2=dict=1MiB)                  :  22MB   12.0s
 (--lzma2=dict=64KiB)                       :  27MB   12.8s
 (--lzma2=dict=1MiB)                        :  25MB   12.3s

------------------------------------------------------------------------------
 results2 (openclipart-png, an arch:all and huge package)
------------------------------------------------------------------------------

# du -m *
(snip)
# time gzip -d data.tar.gz
# time bzip2 -dfk data.tar.bz2
# time xz -dfk data.tar.xz
(snip)

                                              size    decomp-time
without compression                         : 632MB   -
Default compression (gzip -9)               : 607MB   48.7s
bzip2 compression (bzip2 -9)                : 611MB   6m52s
xz
 (--check=crc32 --lzma2=dict=64KiB)         : 604MB   2m09s
 (--check=crc32 --lzma2=dict=1MiB)          : 601MB   2m12s
 (--lzma2=dict=64KiB)                       : 604MB   2m12s
 (--lzma2=dict=1MiB)                        : 601MB   2m11s

------------------------------------------------------------------------------
 results3 (libreoffice-core on an amd64 machine)
------------------------------------------------------------------------------

armel vs Intel Core i3 2.90GHz
 -> decompression is almost x5 faster than on armel; the size is about 10% larger.

                                              size    decomp-time
xz
 (--x86 --lzma2=dict=1MiB)                  :  25MB   2.7s

------------------------------------------------------------------------------
 conclusion (half)
------------------------------------------------------------------------------

We should use xz compression instead of bzip2 at least.
bzip2 is harmful for compressing Debian packages, so we should drop support
for it, which also makes checking easier.

Using xz is
 - smaller than gz and bz2: it can cut about 1/3 of the size
 - faster than bz2 and not much slower than gz (on armel, at least):
   about 1.5 times slower than gzip

gzip or xz?
 - cutting 1/3 of the size = less download time/traffic and a smaller repository
 - 1.5 times slower = more extraction time when the package is installed

 -> average download rate = almost 600KB/s
 -> download 35MB = 60 sec
             24MB = 40 sec
 -> download diff = 20 sec
 -> + 4 sec (extra extraction time) - 20 sec (download time saved)
    = -16 sec (if you use xz)

------------------------------------------------------------------------------
 conclusion (rest)
------------------------------------------------------------------------------

I recommend using xz ***by default*** (with appropriate options) not only
on i386/amd64 but on ANY architecture.

The increase in extraction time is offset by the decrease in download time,
and extraction is only one part of installation. As Mike Hommey suggested,
"I/O is still more time consuming than CPU", and the cost is nothing worse
than high CPU usage.

We know some packages are better off with gzip, but they are exceptions.
xz is the best choice for the remaining 99.99% of packages. We can deal
with such exceptions by specifying gzip for them (e.g. openclipart-png).

*** what's the best compression option for the default? ***

 low CPU                : --check=crc32                -> -10% time
 low memory             : --lzma2=dict=64KiB (or -0)   -> use 100KiB mem
 average CPU/memory     : --lzma2=dict=8MiB (= -6 = default)
 use arch optimization? : Yes, if we can (*)           -> -10% size

*** how to find the appropriate compression level (1, 6 or 9) for xz? ***

 Build your package with each option :-)

 I've proposed a tiny hack for debhelper: when an environment variable is
 set, it builds the package with each compression option - gz, 1, 6, 9,
 1e, 6e and 9e.
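
If you just want a quick look before rebuilding anything, the same comparison can be roughly approximated on an already extracted data.tar. The following is only a sketch and is not the debhelper patch: the function name compare_compression is made up for this mail, and it assumes gzip (and optionally xz) is in PATH.

```shell
#!/bin/sh
# compare_compression: hypothetical helper, NOT the debhelper hack itself.
# Prints the compressed size in bytes of one tarball for gzip -9 and for
# the xz presets 1, 6, 9, 1e, 6e and 9e.
compare_compression() {
    tarball=$1

    # gzip -9 is the baseline; -c writes to stdout, so the input is kept.
    printf '%-4s %s\n' gz "$(gzip -9 -c "$tarball" | wc -c)"

    # Assumption: xz may be missing on a minimal system; skip it in that case.
    command -v xz >/dev/null 2>&1 || return 0

    # "-1e" is the short form of "-1 --extreme", and so on.
    for opt in 1 6 9 1e 6e 9e; do
        printf '%-4s %s\n' "$opt" "$(xz "-$opt" -c "$tarball" | wc -c)"
    done
}
```

Running "compare_compression data.tar" prints one line per setting. It only compares sizes; decompression time on the target machine still has to be measured separately, as in the results above.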
 See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=686048 for the
 proposed debhelper hack.

------------------------------------------------------------

*) tiny sketch in shell ("maybe" = I'm not sure about that filter for that arch)

arch=$(dpkg-architecture -qDEB_HOST_ARCH)
case "$arch" in
    arm|armel|armhf|aarch64)   on_arch=--arm ;;      # maybe
    powerpc|ppc64|powerpcspe)  on_arch=--powerpc ;;  # maybe
    sparc|sparc64)             on_arch=--sparc ;;    # maybe
    ia64)                      on_arch=--ia64 ;;
    i386|amd64)                on_arch=--x86 ;;
    *)                         on_arch= ;;           # other archs: no arch filter
esac

-- 
Regards,

 Hideki Yamane     henrich @ debian.or.jp/org
 http://wiki.debian.org/HidekiYamane