Peter Hansen <[EMAIL PROTECTED]> writes:

> Good point. When I wrote that I was picturing the form of compression
> that a .tar.gz file would have, not what is actually used inside a
> .zip file which is -- quite logically now that you point it out --
> done on a file-by-file basis. (Clearly to do otherwise would risk
> your data and make changing compressed zips highly inefficient.)

Right, and yes, .tar.gz files are very problematic for algorithms such as rsync, because a single change early in the input can alter the compressed output from that point onward. In fact, there was a patch made available for gzip (it never made it into the actual package, I believe) that permitted resetting the compression engine at selected block boundaries, thus effectively bounding the "noise" generated by a single change. The output would grow a bit, since resetting the engine dropped overall efficiency, but you got a tremendous gain back in terms of "rsyncability" of the file.

-- David
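
For anyone who wants to see the effect, here is a rough sketch using Python's zlib module. It is not the gzip patch itself, just the same idea under some assumptions of mine: the engine is reset with Z_FULL_FLUSH at fixed 64 KiB offsets (BLOCK_SIZE, compress_with_resets and make_sample are names made up for this example), and the output is a raw deflate stream with no header or checksum. Fixed offsets only help with in-place changes; an insertion would shift every later boundary.

```python
import random
import zlib

BLOCK_SIZE = 64 * 1024  # hypothetical reset interval, chosen for illustration


def compress_with_resets(data, block_size=BLOCK_SIZE):
    """Compress data as raw deflate, resetting the engine after every block.

    Returns a list with one compressed piece per input block, followed by
    the short piece that terminates the stream.
    """
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)  # wbits=-15: raw deflate
    pieces = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        # Z_FULL_FLUSH emits all pending output, byte-aligns it, and drops
        # the history, so the next block is compressed independently.
        pieces.append(comp.compress(block) + comp.flush(zlib.Z_FULL_FLUSH))
    pieces.append(comp.flush())  # Z_FINISH terminator
    return pieces


def make_sample(size, seed=0):
    """Build deterministic, compressible sample data (assumed test input)."""
    rng = random.Random(seed)
    words = [b"alpha", b"beta", b"gamma", b"delta", b"epsilon", b"zeta"]
    out = bytearray()
    while len(out) < size:
        out += rng.choice(words) + b" "
    return bytes(out[:size])


original = make_sample(8 * BLOCK_SIZE)
changed = bytearray(original)
changed[10] ^= 0xFF  # flip one byte inside the first block
changed = bytes(changed)

a = compress_with_resets(original)
b = compress_with_resets(changed)

# Count the compressed pieces that differ between the two versions.
differing = sum(1 for x, y in zip(a, b) if x != y)
print("pieces:", len(a), "differing:", differing)
print("size with resets:", sum(len(p) for p in a))
print("size without resets:", len(zlib.compress(original, 9)))
```

Only the piece holding the changed byte should differ, so a tool like rsync can skip the rest. The two size lines give a feel for the cost of the resets on this particular input.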