Chris Meadors wrote:

Rsync treats all files as binary.  When finding changes it splits a file
into blocks, computes a checksum for each block and performs a
comparison between the sending and receiving side.  Then it only sends
the blocks which have changed.

When dealing with a text file which has been appended to, like a log,
all the initial blocks are the same.  But if the file is sorted, it's
possible that only a few additional lines will disrupt almost every
block by changing the start offsets throughout the entire file.
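
To make the fixed-offset comparison described above concrete, here is a
minimal Python sketch. It is only an illustration of that simplified
model, not rsync's actual code; the function names, the 8-byte block
size and the use of SHA-256 are all assumptions chosen for the example.

```python
import hashlib


def block_digests(data: bytes, block_size: int) -> list[bytes]:
    """Hash each fixed-size block, starting at offsets 0, block_size, ..."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = 8) -> list[int]:
    """Indices of blocks in `new` that differ from the block at the
    same index in `old` (hypothetical helper, for illustration only)."""
    old_d = block_digests(old, block_size)
    new_d = block_digests(new, block_size)
    return [
        i for i, digest in enumerate(new_d)
        if i >= len(old_d) or digest != old_d[i]
    ]


# Twenty 9-byte lines, like a small log file.
old = b"".join(b"line %03d\n" % i for i in range(20))
appended = old + b"line 020\n"    # new line at the end
prepended = b"line -01\n" + old   # new line at the start

print(len(changed_blocks(old, appended)))   # only the tail blocks differ
print(len(changed_blocks(old, prepended)))  # most blocks differ
```

Appending touches only the last couple of blocks, while inserting one
line at the front shifts every later byte and so disturbs most of the
fixed-offset blocks.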

It's actually more efficient than that!
It uses a rolling checksum to find matching blocks at any byte offset
throughout the file, not just at fixed block boundaries. So in
principle, you can add a short bit to the front of a large file, or
even chop a file up into chunks and rearrange them, and it will still
only transfer the changes.
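
As a rough sketch of the idea (not rsync's real implementation), the
Python below slides a cheap rolling checksum over the new file one byte
at a time and confirms candidate hits with a stronger hash. The names
weak_sum, roll and find_matches, the 16-bit modulus, the 16-byte block
size and SHA-256 as the strong hash are assumptions made for this
example.

```python
import hashlib

MOD = 1 << 16  # assumed modulus for the two halves of the weak checksum


def weak_sum(block: bytes) -> tuple[int, int]:
    """Compute the weak checksum (a, b) of a block from scratch."""
    a = sum(block) % MOD
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % MOD
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, size: int) -> tuple[int, int]:
    """Slide the window one byte: drop out_byte, take in in_byte."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - size * out_byte + a) % MOD
    return a, b


def find_matches(old: bytes, new: bytes, size: int) -> list[tuple[int, int]]:
    """Return (offset in new, block number in old) for every old block
    found at any byte offset in new."""
    table: dict[tuple[int, int], dict[bytes, int]] = {}
    for idx in range(0, len(old) - size + 1, size):
        block = old[idx:idx + size]
        strong = hashlib.sha256(block).digest()
        table.setdefault(weak_sum(block), {})[strong] = idx // size

    matches = []
    pos = 0
    weak = weak_sum(new[:size]) if len(new) >= size else None
    while pos + size <= len(new):
        block_no = None
        candidates = table.get(weak)
        if candidates:  # weak hit: confirm with the strong hash
            strong = hashlib.sha256(new[pos:pos + size]).digest()
            block_no = candidates.get(strong)
        if block_no is not None:
            matches.append((pos, block_no))
            pos += size  # skip past the matched block
            if pos + size <= len(new):
                weak = weak_sum(new[pos:pos + size])
        else:
            if pos + size < len(new):
                weak = roll(*weak, new[pos], new[pos + size], size)
            pos += 1
    return matches


old = bytes(range(64))                       # four 16-byte blocks
shifted = b"XYZ" + old                       # short bit added to the front
shuffled = old[32:48] + old[:16] + old[48:] + old[16:32]

print(find_matches(old, shifted, 16))
print(find_matches(old, shuffled, 16))
```

In both cases every old block is located again, so only the three new
bytes (or, for the rearranged file, just the block references) would
need to be sent.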

Andrew Tridgell's research paper is available at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.123.1530&rep=rep1&type=pdf
rsync is covered from section 3 onwards.


--
Simon Hobson
