On 06/13/13 10:47, Mark Adler wrote:
> How do you know it's better?

I'm afraid I got that info from the rsyncable-related comments in the
pigz source code, so I don't have independent confirmation.

> pigz is designed to produce exactly the same output regardless of
> whether multiple processors are used or not.

That's important, but I worry that people will run into problems when
doing regression testing with the old version of gzip versus the new
one (assuming the new GNU gzip uses pigz).  The byte outputs will
differ, and this may cause problems.  Perhaps people wouldn't complain
so much if the new GNU gzip defaults to the pigz algorithm with a
block size of 256 K, as that should give better compression than
traditional gzip, if I understand your examples correctly.
One hacky way around the compatibility issue would be to keep the
current gzip source code, and to use the old gzip implementation
unless the user specifies a new option like -b or --rsyncable.  That
would maintain compatibility, but at a painful maintenance cost.

Another possibility would be to add an option to pigz that tells it
not to break the input into blocks that can be recombined on byte
boundaries, so that the pigz output is byte-for-byte the same as
traditional gzip's (albeit at a cost -- pigz won't parallelize if
this option is used).  I don't know whether that's a reasonable
suggestion, though, as I don't know the pigz internals.