I can't address the algorithm questions, but I'll tell you that we had a tremendous improvement in speed when we switched to a newer version of rsync.
We are using it (in this case) to rsync our Oracle files to a separate partition on the same system.

> I'm using rsync to copy some large (>1GB) oracle datafiles. I've noticed
> that sometimes it transfers some of the files twice.
>
> Some earlier posts to this list that I saw in the archives seemed to
> indicate that this is a problem with the rsync algorithm itself when
> dealing with large files. Some of the mails seemed to indicate that this
> can be mitigated by using larger block sizes, though there were some
> caveats that increasing block size without increasing checksum size
> might cause more hash collisions.
>
> My questions:
>
> 1) Can anyone explain the problem to me in layman's terms? Is the
> initial bad transfer due to hash collisions?
>
> 2) If I'm transferring files that are 1-2GB, would increasing the
> block-size parameter to 8k or so help here? Or would I be creating more
> chances for hash collisions since I can't increase the checksum size?
>
> 3) I'm using 2.5.5 (yeah, ancient I know, I'll be upgrading it soon).
> Are later versions better at dealing with this problem?
>
> Any help is appreciated!
>
> Thanks,
> Jeff
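FWIW, if you want to experiment with your question 2 yourself, rsync lets you force the block size from the command line with -B/--block-size. A rough sketch of the sort of invocation we use for the Oracle copies (the paths here are made up, not our real layout):

    # Force an 8 KB checksum block size instead of rsync's default,
    # and print transfer statistics so runs can be compared.
    rsync -av --stats --block-size=8192 /u01/oradata/ /backup/oradata/

If I remember right, later rsync versions also pick the block size (and the per-block checksum length) based on the file size automatically, which may be part of why the upgrade helped us.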