On 06/13/13 10:47, Mark Adler wrote:
> How do you know it's better?

I'm afraid I got that info from the rsyncable-related comments in the
pigz source code, so I don't have independent confirmation.

> pigz is designed to produce exactly the same output regardless of
> whether multiple processors are used or not.

That's important, but I worry that people will run into problems when
doing regression testing with the old version of gzip versus the new
one (assuming the new GNU gzip uses pigz).  The byte outputs will
differ, and this may cause problems.  Perhaps people wouldn't complain
so much if the new GNU gzip defaults to the pigz algorithm with a
block size of 256 K, as that should give better compression than
traditional gzip, if I understand your examples correctly.
One hacky way around the compatibility issue would be to keep the
current gzip source code, and to use the old gzip implementation
unless the user specifies a new option like -b or --rsyncable.  That
would maintain compatibility, but at a painful maintenance cost.

Another possibility would be to add an option to pigz that tells it
not to break the input into blocks that can be recombined on byte
boundaries, so that the pigz output is byte-for-byte the same as
traditional gzip's (albeit at a cost -- pigz won't parallelize if
this option is used).  I don't know whether that's a reasonable
suggestion, though, as I don't know the pigz internals.