On Mon, Feb 09, 2004 at 09:14:06AM -0500, Jason M. Felice wrote:
> I got the go-ahead from the client on my --link-by-hash proposal, and
> the seed is making the hash unstable.  I can't figure out why the seed
> is there so I don't know whether to circumvent it in my particular case
> or calculate a separate, stable hash.

I believe the checksum seed is meant to reduce the chance that different
data repeatedly produces the same MD4 digest over multiple runs.  If a
collision happens, the hope is that a different checksum seed will break
it.

However, my guess is that it doesn't make any difference.  Certainly,
adding the seed at the end of the block won't break an existing
collision, even if the seed changes between runs.  File MD4 checksums
add the seed at the beginning, which might help break collisions,
although I'm not sure.

Wayne Davison writes:
> There was some talk last year about adding a --fixed-checksum-seed
> option, but no consensus was reached.  It shouldn't hurt to make the
> seed value constant for certain applications, though, so you can feel
> free to proceed in that direction for what you're doing for your client.
>
> FYI, I just checked in some changes to the checksum_seed code that will
> make it easier to have other options (besides the batch ones) specify
> that a constant seed value is needed.

I would really like a --fixed-csumseed option to become a standard
feature in rsync.  Just using the batch value (32761) is fine.  Can I
contribute a patch?

The reason I want this is that the next release of BackupPC will support
rsync checksum caching, so that backups don't need to recompute block or
file checksums.  This requires a fixed checksum seed on the remote
rsync, hence the need for --fixed-csumseed.  This feature is already in
the pre-built rsync for Cygwin that I provide on the SourceForge
BackupPC downloads page.

Craig
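
The point above about seed placement can be illustrated with a small
sketch.  This is not rsync's code and not MD4: it assumes a toy iterated
hash with a deliberately tiny 16-bit chaining state (built on SHA-256
purely for convenience), no length padding, and a 4-byte little-endian
seed.  The names compress, toy_hash, find_collision and seed_bytes are
hypothetical.  It only shows the general mechanism: once two equal-length
inputs have reached the same internal state, anything appended
afterwards cannot separate them, whereas a prepended seed changes the
starting state and so usually does.

```python
import hashlib

STATE_BYTES = 2   # deliberately tiny chaining state so collisions are cheap to find
BLOCK_BYTES = 4   # toy block size; the seed below is also 4 bytes wide


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function (assumption: a stand-in, NOT MD4)."""
    return hashlib.sha256(state + block).digest()[:STATE_BYTES]


def toy_hash(data: bytes) -> bytes:
    """Iterate compress() over fixed-size blocks; length padding is omitted."""
    assert len(data) % BLOCK_BYTES == 0
    state = bytes(STATE_BYTES)  # all-zero initial state
    for i in range(0, len(data), BLOCK_BYTES):
        state = compress(state, data[i:i + BLOCK_BYTES])
    return state


def find_collision():
    """Birthday search for two distinct one-block inputs with the same toy hash."""
    seen = {}
    for n in range(1 << 20):
        msg = n.to_bytes(BLOCK_BYTES, "little")
        digest = toy_hash(msg)
        if digest in seen:
            return seen[digest], msg
        seen[digest] = msg
    raise RuntimeError("no collision found")


def seed_bytes(seed: int) -> bytes:
    """Encode the seed as 4 little-endian bytes (an assumption of this sketch)."""
    return seed.to_bytes(4, "little")


a, b = find_collision()
seeds = [32761, 1, 2, 12345, 0x7FFFFFFF]

# Seed appended after the data: the colliding state is reached first,
# so every seed yields the same digest for both inputs.
suffix_still_collides = [
    toy_hash(a + seed_bytes(s)) == toy_hash(b + seed_bytes(s)) for s in seeds
]

# Seed prepended before the data: the starting state differs per seed,
# so the original collision is unlikely to survive.
prefix_still_collides = [
    toy_hash(seed_bytes(s) + a) == toy_hash(seed_bytes(s) + b) for s in seeds
]

print("seed at end, still colliding:  ", suffix_still_collides)
print("seed at start, still colliding:", prefix_still_collides)
```

With real MD4 the same reasoning applies only to collisions in the
internal chaining state of equal-length inputs, so this should be read
as an illustration of the argument rather than a statement about every
possible MD4 collision.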