> On Aug 19, 2010, at 2:54 PM, Paul Eggert wrote:
>> In that case I'm afraid that we need to give up on the goal of always
>> providing a correct uncompressed length. At this point the gzip
>> format is so widely used that an incompatible change to it would cause
>> far more trouble than the relatively minor problem of gzip -l
>> reporting the wrong length. Instead, it might be better to leave the
>> format alone, and to change gzip -l so that it decompresses the data
>> in order to report the uncompressed data length.
That would be a significant performance hit, as Mark also noted. At
least in the file (as opposed to streaming) case, there are some
alternatives:

 - Add a suboption (-ll? -l2?) to do decompression-listing.

 - If just -l and compressed size is more than 250 MB, warn that the
   uncompressed size might be off by a multiple of 2^32 and note the
   secondary option.

 - If the compressed size is more than 4 GB, perhaps force
   decompression-listing, or else make the warning more definite
   (maybe even an error).

Of course, I didn't even know there was a -l option, so feel free to
ignore me...

Mark wrote:
> So I'm thinking we should put forth a format amendment.
> However initially only decoding the format would be supported,
> not creating it. We would let that simmer for, say, three years
> for the updated gzip, pigz, and zlib to free range. Then let
> loose versions that create the format once the decoding has a
> wide distribution.

Of course, some third-party apps will undoubtedly start writing the
new format even before the updated standard is ratified...

> I used to think that eventually this would all go away since
> the gzip format would surely be supplanted by something else.
> However even with .bz2 and .xz formats with better compression,
> that simply hasn't happened. So now I'm beginning to think
> that the .gz format will stubbornly persist at least until I die
> (at which point I won't care anymore).

Yup. gzip is an excellent compromise between speed (both sides) and
compression ratio. bzip2 is completely pointless these days (-1 same
as gzip -9 but much slower, -9 same as xz -1 or -2 but slower), and xz
offers a decent improvement in compression ratio at close to an
order-of-magnitude hit in compression speed and 2-3x hit in
decompression speed (relative to gzip, that is).

> So how would a format amendment be put forth? As an addition
> to the existing RFC? As a new RFC? Who knows how to do that?
I believe updated RFCs appear as new numbers--the e-mail format
(RFC-822?) got updated at least once or twice, IIRC. I'm pretty sure
the procedure is documented either in an RFC or somewhere on IANA's
web site, but Glenn is the one who dealt with most of that, as I
recall. I'll cc him (not that I'm volunteering him for anything, of
course!).

> By the way, I keep getting bounce messages from Peter's address,
> so I don't think he's getting these emails.

Yup, he's gone. I've removed him from the cc list.

Greg
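
P.S. For anyone following along, here is a rough Python sketch of the
listing alternatives above. It is only an illustration, not how gzip
itself is implemented. The function names and the exact threshold
constants are made up for this example (the 250 MB and 4 GB figures
are the ones suggested above), and it assumes a single-member .gz
file whose last four bytes are the ISIZE field, i.e. the uncompressed
length modulo 2^32.

```python
import gzip
import os
import struct

# Hypothetical thresholds, taken from the suggestions in this message.
WARN_BYTES = 250 * 1000**2    # warn above roughly 250 MB compressed
FORCE_BYTES = 4 * 1024**3     # force decompression above 4 GB compressed


def trailer_isize(path):
    """Read ISIZE: the last 4 bytes of the file, little-endian uint32."""
    with open(path, "rb") as f:
        f.seek(-4, os.SEEK_END)
        return struct.unpack("<I", f.read(4))[0]


def true_length(path, chunk=1 << 20):
    """Decompress the whole file and count the bytes (slow but exact)."""
    total = 0
    with gzip.open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            total += len(block)
    return total


def list_length(path, force=False):
    """Return (length, note), mimicking the proposed -l policy."""
    size = os.path.getsize(path)
    if force or size > FORCE_BYTES:
        return true_length(path), "exact"
    if size > WARN_BYTES:
        return trailer_isize(path), "may be off by a multiple of 2**32"
    return trailer_isize(path), "from trailer"
```

Calling list_length(path, force=True) plays the role of the proposed
-ll/-l2 suboption; without it, small files are answered from the
trailer alone, which is the cheap path that plain -l takes today.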