Not in the least. The only checksum that guarantees that two files are identical is one from which the entire file can be regenerated in exactly one way; in other words, some form of compression. If you want to send the whole file, that's fairly straightforward. Rsync is a way of optimizing the process within certain limits. If the one-in-whatever-it-is chance of a false match with those checksums is not good enough, don't use rsync.
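To put a rough number on that "1/whatever", here is a back-of-the-envelope sketch in Python. It is not taken from the rsync source. It assumes the strong checksum behaves like a uniformly random value, and the 128-bit width, the 1 GiB file, the 700-byte block size, and the function name false_match_bound are all assumptions chosen for illustration; substitute the numbers that apply to your rsync version and data.

```python
from fractions import Fraction


def false_match_bound(comparisons: int, checksum_bits: int) -> Fraction:
    """Union bound on the chance of at least one false block match.

    Assumes (hypothetically) that the strong checksum of a non-matching
    block is uniformly distributed over 2**checksum_bits values.
    """
    return min(Fraction(1), Fraction(comparisons, 2 ** checksum_bits))


# Illustrative, assumed figures: a 1 GiB file split into 700-byte blocks.
file_size = 2 ** 30
block_size = 700
blocks = -(-file_size // block_size)  # ceiling division

# Pessimistic count: every byte offset is compared against every block.
comparisons = file_size * blocks

bound = false_match_bound(comparisons, checksum_bits=128)
print(f"upper bound on a false match: {float(bound):.3e}")
```

The count of comparisons is deliberately pessimistic, so this is an upper bound under the stated assumptions, not a measurement. A shorter checksum raises the bound quickly, which is why the width is left as a parameter.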

Tim Conway
[EMAIL PROTECTED]
303.682.4917
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack(nnnnnnnnnnnn, 19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970), ".\n" '
"There are some who call me.... Tim?"

"Berend Tober" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
04/17/2002 06:52 AM

To: [EMAIL PROTECTED]
cc: (bcc: Tim Conway/LMT/SC/PHILIPS)
Subject: Non-determinism
Classification:

Is anyone else concerned about the fact that rsync doesn't guarantee to produce identical file copies on the target machine?

Don't get me wrong in sounding critical, because I think that rsync is a great example of how software should be written. (I often make the observation, as I learn more about Linux and inevitably find myself comparing open source applications to Microsoft products, that the people who wrote Unix way back when at AT&T Bell Labs REALLY knew what they were doing. I have the same attitude toward the developer and maintainer of rsync.)

But the "Technical Report" at http://rsync.samba.org/tech_report/tech_report.html states that:

"If the two strong checksums match, we assume that we have found a block of A which matches a block of B. In fact the blocks could be different, but the probability of this is microscopic, and in practice this is a reasonable assumption."

Is that good enough? The statement, I believe, refers to some analytical estimate of the chance that the checksums might match despite the source blocks being different, but has anyone done empirical work to verify that we can pretty much count on getting reliable file copies on the target? And how does this small probability of file corruption compare to, say, using a full file transfer or copy?
In the latter case, you might be tempted to think there is zero probability of file corruption, but if you think of any data transfer as sending a digital signal through a noisy communication channel, there must be some way to quantify the reliability of cp versus rsync. I'm not sure that I have all the skills to do this analysis, but I'd be interested in seeing it done.

Regards,
Berend Tober

--
To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html