On 12 Jun 1999, Adam Di Carlo wrote:

> Jason, I hadn't read back on all this discussion. I caught up a bit
> more. Sorry to have you hash this out again.
Oh that's OK; with any luck it has made things clearer for everyone :>

> However, I personally would weigh in, in the case of an upstream
> .tar.gz file which dpkg-source can handle and doesn't have other
> problems (such as shipping binaries), in that case, I think it's
> better to use the "pristine upstream archive" as you call it. And I
> think this normative *recommendation* should be in Policy.
>
> Do you disagree with that point?

Somewhat. A secondary goal here is to encourage increased compression
to reduce the size of the source archive, and retaining poorly
compressed pristine archives is contrary to that :< The way I worded
the proposal was to justify recompressing source archives and to
provide some rules for when this can occur.

How about this: can someone whip up a script that, when run over a tree
of .gz files, will compute the size change when they are converted to
.bz2 files and generate a report? The report should show the overall
size change and a breakdown of the change for files in some size
ranges, say 0-100k, 100-500k, 500k-1M, 1M-5M, 5M-10M, and >10M. When it
is completed I can let it grind away on samosa overnight.

Given that data we can see for certain what sort of savings we are
talking about here and exactly how much it will 'cost' us to keep
pristine archives. I ask for the size ranges because it may turn out
that it is only a big win for big packages.

Some speculation. From the kernel we see:

-rw-r--r--   1 500      100      13827947 May 13 23:54 linux-2.2.9.tar.gz
-rw-r--r--   1 root     root     11235732 May 13 16:54 linux-2.2.9.tar.bz2

which is about 0.81 times the size when compressed with .bz2. We have
about 1.0G of source files, so we might save about 200 meg, which is
30% of a CD-ROM. I think there is potential for this level of savings,
particularly considering that most tar.gz files are not going to have
been compressed with -9.

Jason
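The requested report script could be sketched roughly as below. This is
a hypothetical illustration, not an existing tool: the bucket labels
and function names are made up, each .gz is recompressed in memory for
simplicity (a real run over an archive tree would want to stream large
files), and bzip2 level 9 is assumed throughout.

```python
# Hypothetical sketch: walk a tree of .gz files, recompress each as
# .bz2 in memory, and report size deltas bucketed by original size.
import bz2
import gzip
import os
from collections import defaultdict

# Upper bound (exclusive) and label for each size range from the mail.
BUCKETS = [
    (100 << 10, "0-100k"),
    (500 << 10, "100-500k"),
    (1 << 20, "500k-1M"),
    (5 << 20, "1M-5M"),
    (10 << 20, "5M-10M"),
    (float("inf"), ">10M"),
]

def bucket(size):
    """Return the size-range label for a .gz file of `size` bytes."""
    for limit, label in BUCKETS:
        if size < limit:
            return label

def bz2_size(gz_bytes):
    """Size the .bz2 equivalent of a .gz file held in memory."""
    raw = gzip.decompress(gz_bytes)
    return len(bz2.compress(raw, 9))

def report(tree):
    """Print per-bucket and overall .gz -> .bz2 size changes."""
    totals = defaultdict(lambda: [0, 0])  # label -> [gz bytes, bz2 bytes]
    for dirpath, _, files in os.walk(tree):
        for name in files:
            if not name.endswith(".gz"):
                continue
            with open(os.path.join(dirpath, name), "rb") as fh:
                data = fh.read()
            entry = totals[bucket(len(data))]
            entry[0] += len(data)
            entry[1] += bz2_size(data)
    all_gz = sum(gz for gz, _ in totals.values())
    all_bz = sum(bz for _, bz in totals.values())
    for label, (gz, bz) in totals.items():
        print("%-9s %12d -> %12d  (%.2fx)" % (label, gz, bz, bz / gz))
    if all_gz:
        print("overall   %12d -> %12d  (%.2fx)"
              % (all_gz, all_bz, all_bz / all_gz))
```

Run as `report("/path/to/archive")`; the final line gives the overall
ratio that decides whether the recompression is worth it.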
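The back-of-the-envelope savings figure can be checked directly. A
minimal sketch, assuming the ~1.0G archive size quoted above and a
650 MB CD-ROM (the CD capacity is an assumption typical of 1999 media,
not stated in the mail):

```python
# Check the kernel-tarball arithmetic from the mail.
gz = 13827947   # linux-2.2.9.tar.gz  (bytes)
bz = 11235732   # linux-2.2.9.tar.bz2 (bytes)

ratio = bz / gz                # ~0.81: bz2 is ~81% of the gz size
saved = 1.0e9 * (1 - ratio)    # applied to ~1.0G of source files
cd_fraction = saved / 650e6    # fraction of an assumed 650 MB CD-ROM

print("ratio %.2f, saved %.0f MB, %.0f%% of a CD"
      % (ratio, saved / 1e6, cd_fraction * 100))
```

The exact figures come out near 187 MB and 29%, which the mail rounds
to "about 200 meg" and 30%.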