On Tue, Jul 02, 2002 at 05:51:58PM -0600, Joel Votaw wrote:
> 
> Attached is a patch that implements compressing output files as they're
> written to disk, using zlib.  Thus far I've only used it with
> synchronizing directories on a single machine.

This would certainly be useful once it is reliable.  It would be
handy for dirvish and other backup tools.

> 
> What seems to work / what's done:
> 
>       - Synchronizing directories with all files in the target
>         directory gzip'd.  Files seem to contain the correct data.  Use
>         the option "--gzip-dest".

There should also be a --gzip-src option to allow reciprocal
transfers.  I notice the comments in the patch mention this.

>       - Only transferring files whose checksums are different.
>         Destination files are gunzip'd before their checksums are
>         calculated.
> 
>       - Added an option "--ignore-sizes", since there is no easy way for
>         the receiver to know the uncompressed size of the files it
>         already has.  For now you have to use --checksum to be sure...

This option shouldn't be necessary once you extract the
uncompressed size from the internal gzip file structure (the
trailer records it, modulo 2^32).
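
To illustrate what I mean: per RFC 1952 a gzip member ends with a
4-byte little-endian ISIZE field holding the uncompressed length
modulo 2^32.  A rough sketch, in Python rather than rsync's C, and
assuming a single-member file smaller than 4 GiB.  The function
names are mine, not anything in the patch:

```python
import gzip
import os
import struct
import tempfile


def gzip_uncompressed_size(path):
    """Return ISIZE from the gzip trailer (uncompressed length mod 2**32).

    Assumes a single-member gzip file; hypothetical helper, not rsync code.
    """
    with open(path, "rb") as f:
        f.seek(-4, os.SEEK_END)              # ISIZE is the last 4 bytes
        (isize,) = struct.unpack("<I", f.read(4))
    return isize


def demo():
    # Write a small gzip file, then read its recorded size back.
    payload = b"hello rsync\n" * 1000
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    try:
        with gzip.open(path, "wb") as gz:
            gz.write(payload)
        return gzip_uncompressed_size(path), len(payload)
    finally:
        os.remove(path)
```

Files of 4 GiB or more, or files made of several concatenated
members, would need a fallback such as a full decompress.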

> 
>       - Added gzio.c from the latest zlib distribution so we can call
>         gzwrite() etc.
> 
> What remains to be done / problems:
> 
>       - Needs more testing, especially with remote clients / servers.
> 
>       - Batch files are not compressed.

Huh?  Please explain what a "batch" file is and why it doesn't
get compressed.

> 
>       - Reading compressed files should be implemented in a more generic
>         fashion, perhaps in map_file() and its cousins.  I started
>         working on this but saw that changing map_file() et al. could
>         have far reaching consequences, so I took the easy way out: I
>         just changed the one routine I cared about for now.
> 
>       - Add documentation of new options to manpages etc.
> 
>       - Find a way to write down the uncompressed file sizes on the
>         receiving side.  Perhaps the least-bad way to do this would be
>         append some rsync-specific data, including uncompressed size, to
>         the end of the gzip'd files.  The receiver could read this in on
>         future runs when it needed to.  Gunzip'ing the file from the
>         command-line would work but would give an "ignoring trailing
>         garbage" kind of error.
> 
> What I've done so far isn't pretty, but I thought I'd send it in in case
> someone else finds it useful.
> 
>       -Joel

It seems (to me) a reasonable start so far.
The comments show some foresight re bidirectional plans and
support for other compression libs and levels.

I don't know if I'd support multiple compression libs, but if
you do, might I suggest calling the option --zip-dest
and having it take an argument to specify the compression
library?  I.e. --zip-dest (bzip2|gzip)[=(1-9)]

In any case I would make --gzip-dest take an optional
argument for specifying the compression level right away.
Also lower the default level to 6, as the speed penalty
for level 9 is seldom worth the marginal compression increase.
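
As a sketch of the argument syntax suggested above (Python for
brevity; the real thing would go through rsync's option parsing in
C).  The name parse_zip_dest and the DEFAULT_LEVEL constant are
hypothetical, and the default of 6 is just my suggestion:

```python
import re

DEFAULT_LEVEL = 6  # assumed default, per the suggestion in this mail

# Matches "gzip", "gzip=9", "bzip2=3", ...: (bzip2|gzip)[=(1-9)]
_SPEC = re.compile(r"(bzip2|gzip)(?:=([1-9]))?")


def parse_zip_dest(arg):
    """Split a --zip-dest argument into (library, level)."""
    m = _SPEC.fullmatch(arg)
    if m is None:
        raise ValueError("bad --zip-dest argument: %r" % (arg,))
    lib, level = m.group(1), m.group(2)
    # Fall back to the default level when no "=N" suffix was given.
    return lib, int(level) if level else DEFAULT_LEVEL
```

So "gzip" alone would mean gzip at the default level, and
"bzip2=9" would ask for bzip2 at its highest level.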

One extra issue to consider is that accidentally leaving off
the --gzip-* option would really mess things up (imagine
restoring /usr and ending up with the compressed files in
place).  Long term, a sanity check with a way to override it
might be in order.
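
One possible shape for such a check, purely as a sketch: look for
the gzip magic bytes (0x1f 0x8b) at the start of an existing file
and refuse to proceed if that disagrees with the options given,
unless the user forces it.  The function names and the "force"
flag are hypothetical, and I'm using Python only to keep it short:

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip member (RFC 1952)


def looks_gzipped(path):
    """True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def check_dest(path, gzip_dest, force=False):
    """Return True if it looks safe to proceed (hypothetical policy).

    Safe means the file's apparent format agrees with the option,
    or the user explicitly overrode the check.
    """
    if force:
        return True
    return looks_gzipped(path) == gzip_dest


def demo():
    # Build one plain and one gzip'd file, then run the check on each.
    with tempfile.TemporaryDirectory() as d:
        plain = os.path.join(d, "plain.txt")
        packed = os.path.join(d, "packed.gz")
        with open(plain, "wb") as f:
            f.write(b"plain text\n")
        with gzip.open(packed, "wb") as f:
            f.write(b"plain text\n")
        return (
            check_dest(plain, gzip_dest=False),
            check_dest(packed, gzip_dest=False),
            check_dest(packed, gzip_dest=False, force=True),
        )
```

A file that was already a .gz at the source would trip this check
too, which is why the override matters.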


-- 
________________________________________________________________
        J.W. Schultz            Pegasystems Technologies
        email address:          [EMAIL PROTECTED]

                Remember Cernan and Schmitt
