On Thu, May 30, 2002 at 03:35:05PM -0700, jw schultz wrote:
[...]
> > There is a patch available to gzip to add an option --rsyncable that's
> > supposed to make it work better with rsync. It's been put into the
> > "patches" directory for the next release of rsync, or you can get it at
> >
> > http://rsync.samba.org/ftp/unpacked/rsync/patches/gzip-rsyncable.diff
>
> I took a quick look at this patch and i think it does what i expected.
> It resets the compression algorithm after each 4KB of
> compresstext. This means that if you change 1 byte early in
> the file it might or might not affect the blocks later on.
> The reason for the equivication is that if the change alters
> the compression ratio the savings are gone.

If that is how it works, and I think you are right, then it would only
help in the rarest of cases, which makes the gzip-rsyncable patch worse
than useless for the vast majority of them.

Regular resets hurt the compression ratio. For the resulting
compresstext to be unchanged, resets must occur at the same begin/end
boundary points of an unchanged sequence of uncompresstext. After a
change, the resets will only fall on the same boundary points of the
unchanged text that follows if the change alters the size of the
compresstext by nothing at all or by an exact multiple of 4KB. That
applies to any insertion, deletion or replacement. I would guess that
the number of changes meeting this criterion is almost non-existent.

I suspect that the gzip-rsyncable patch does nearly nothing except
produce worse compression. It _might_ slightly increase the
rsyncability up to the point where the first change in the
uncompresstext occurs, but the chance of it re-syncing after that point
would be extremely low.

I tried to think of a way of doing this so that it would eventually
re-sync, with things like resets every <some-prime> bytes so that the
reset window moves. The problem is that the source and target reset
windows must move together for it to work, so any scheme that moves the
reset window into sync will also move the window _out_ of sync. I don't
think it is possible to come up with a scheme where the reset windows
could re-sync after a change and then stay sync'ed until the next
change, unless you dynamically alter the compression at sync time, in
which case you may as well rsync the decompressed files.
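To make the alignment argument concrete, here is a rough sketch of it.
It is a toy model, not the gzip-rsyncable patch: it assumes a "reset" is
the same as compressing each fixed 4KB block of uncompresstext
independently with zlib, which is simpler than resetting every 4KB of
compresstext but shows the same boundary problem. The names BLOCK,
compress_blocks and shared_blocks are made up for the illustration.

```python
import random
import zlib

# Assumed reset interval. In this toy model it is applied to the
# uncompressed input, which is a simplification of the scheme discussed.
BLOCK = 4096

def compress_blocks(data, block=BLOCK):
    """Compress each fixed-size block on its own (a stand-in for a reset)."""
    return [zlib.compress(data[i:i + block]) for i in range(0, len(data), block)]

def shared_blocks(a, b):
    """Count positions where both files hold an identical compressed block."""
    return sum(1 for x, y in zip(a, b) if x == y)

# Deterministic, compressible test data: 16 blocks over a small alphabet.
rng = random.Random(0)
original = bytes(rng.choice(b"abcdefgh \n") for _ in range(16 * BLOCK))

replaced = b"X" + original[1:]  # same length: boundaries stay aligned
inserted = b"X" + original      # one byte longer: every boundary shifts

base = compress_blocks(original)
print(shared_blocks(base, compress_blocks(replaced)))  # 15
print(shared_blocks(base, compress_blocks(inserted)))  # 0
```

A same-length replacement only disturbs the block it lands in, while a
one-byte insertion leaves no later block intact, because every fixed
boundary now falls one byte off in the unchanged text.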
--
----------------------------------------------------------------------
ABO: finger [EMAIL PROTECTED] for more info, including pgp key
----------------------------------------------------------------------