Paul Slootman <paul+rs...@wurtel.net> wrote:
> On Wed 18 Dec 2013, Kevin Korb wrote:
> > Also, rsync's -c is rather dumb as it computes checksums for files
> > that have different sizes so they can't possibly be the same and it
> > computes checksums for files that only exist on one end and therefore
> > has nothing to compare them to.
>
> The list of files on the source is generated and transferred to the
> destination before rsync knows that the destination file is different.
>
> To make rsync checksum only the files with same size would mean changing
> the filesystem scan to a two-pass thing (send the list of filenames plus
> their sizes, wait for the destination to tell you what files need
> checksumming, do that and send the filenames again, now with checksum
> data), and retransferring file metadata again.

That seems to imply that avoiding unnecessary checksum calculations
would double the protocol overhead, which I think is overly
pessimistic.

IIUC, the current protocol, without checksums, amounts to (greatly
simplified):

    Sender                          Receiver

    I have (name1,meta1)
      (name2,meta2), ...
                                    I need #s 3,8,13, ...
    Send requested files
                                    Write files to destination
                                      as received

To do checksums only when needed:

    Sender                          Receiver

    Checksums have been requested.

    I have (name1,meta1)
      (name2,meta2), ...
                                    Based on sizes, I know I need
                                      #s 3,13, ...
                                    and I may need
                                      (#5,checksum5),
                                      (#8,checksum8), ...
    Send known-needed
    For each "may need"
      compute checksumN
      if match
        send "#N matches"
      else
        send "#N no match"
        send #N data
                                    Write files to destination
                                      as received

If that is "a two-pass thing", the current protocol must involve 1.9
passes :)
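
For what it's worth, the checksum-on-demand exchange can be sketched as a
small in-memory simulation. This is only an illustration of the idea, not
how rsync is implemented: the function names (receiver_plan, sender_respond,
sync) are made up, files are modelled as dicts of name -> bytes, the only
metadata compared is size, and MD5 stands in for whatever checksum the real
protocol would use.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Assumption: MD5 is just a stand-in; any strong hash would do here.
    return hashlib.md5(data).hexdigest()


def receiver_plan(file_list, dest):
    """Receiver side: split the sender's list into "need" and "may need".

    file_list is the sender's [(name, size), ...]; dest maps name -> bytes.
    Only files that exist locally with the same size get checksummed.
    """
    needed, may_need = [], []
    for idx, (name, size) in enumerate(file_list):
        local = dest.get(name)
        if local is None or len(local) != size:
            needed.append(idx)                        # missing or size differs
        else:
            may_need.append((idx, checksum(local)))   # same size: send checksum
    return needed, may_need


def sender_respond(source, file_list, needed, may_need):
    """Sender side: send known-needed files, checksum only the "may need" ones."""
    to_send = {idx: source[file_list[idx][0]] for idx in needed}
    computed = 0
    for idx, remote_sum in may_need:
        data = source[file_list[idx][0]]
        computed += 1
        if checksum(data) != remote_sum:              # "#N no match": send data
            to_send[idx] = data
    return to_send, computed


def sync(source, dest):
    """Run one exchange; return (names transferred, checksums computed by sender)."""
    file_list = [(name, len(data)) for name, data in sorted(source.items())]
    needed, may_need = receiver_plan(file_list, dest)
    to_send, computed = sender_respond(source, file_list, needed, may_need)
    for idx, data in to_send.items():                 # write files as received
        dest[file_list[idx][0]] = data
    return sorted(file_list[i][0] for i in to_send), computed


if __name__ == "__main__":
    source = {"a": b"same", "b": b"longer content", "c": b"new", "d": b"abcd"}
    dest = {"a": b"same", "b": b"short", "d": b"abcX"}
    print(sync(source, dest))   # (['b', 'c', 'd'], 2)
```

In this toy run the sender checksums only "a" and "d" (same size on both
ends); "b" and "c" are transferred without any checksum being computed, and
the file list is sent once.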