Chris Meadors wrote:
> Rsync treats all files as binary. When finding changes it splits a file
> into blocks, computes a checksum for each block and performs a
> comparison between the sending and receiving side. Then it only sends
> the blocks which have changed.
>
> When dealing with a text file which has been appended to, like a log,
> all the initial blocks are the same. But if the file is sorted, it's
> possible only a few additional lines will disrupt almost every block by
> changing the start offsets throughout the entire file.
It's actually more efficient than that!
It uses a rolling checksum to find matching blocks at any byte offset
throughout the file. So in principle, you can add a short bit to the
front of a large file, or even chop a file up into chunks and rearrange
them, and it will still only transfer the changes.
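
To make the difference concrete, here is a rough Python sketch (not
rsync's actual code) comparing a naive fixed-offset block comparison
with a rolling-checksum search. The function names, the 64-byte block
size, the test data and the use of SHA-256 as the confirming "strong"
hash are all my own assumptions for illustration; the weak checksum
follows the two-sum form described in the paper. Real rsync also skips
ahead a whole block after a match, which this sketch does not bother
with.

```python
import hashlib

M = 1 << 16  # modulus for both halves of the weak checksum


def weak_checksum(block):
    """Return the (a, b) pair for a block: a plain sum and a weighted sum."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, block_len):
    """Slide the window one byte to the right in constant time."""
    a = (a - out_byte + in_byte) % M
    b = (b - block_len * out_byte + a) % M
    return a, b


def fixed_offset_matches(old, new, block_len):
    """Count blocks of `old` equal to the block at the SAME offset in `new`."""
    count = 0
    for off in range(0, len(old) - block_len + 1, block_len):
        if old[off:off + block_len] == new[off:off + block_len]:
            count += 1
    return count


def rolling_matches(old, new, block_len):
    """Count blocks of `old` that occur at ANY byte offset in `new`."""
    # Index every block of `old` by its weak checksum; keep a strong hash
    # (SHA-256 here, purely as a stand-in) to confirm candidate matches.
    table = {}
    for off in range(0, len(old) - block_len + 1, block_len):
        block = old[off:off + block_len]
        entry = (hashlib.sha256(block).digest(), off)
        table.setdefault(weak_checksum(block), []).append(entry)

    if len(new) < block_len:
        return 0

    found = set()
    a, b = weak_checksum(new[:block_len])
    pos = 0
    while True:
        candidates = table.get((a, b), ())
        if candidates:
            strong = hashlib.sha256(new[pos:pos + block_len]).digest()
            found.update(off for s, off in candidates if s == strong)
        if pos + block_len >= len(new):
            break  # the window has reached the end of `new`
        a, b = roll(a, b, new[pos], new[pos + block_len], block_len)
        pos += 1
    return len(found)


# Made-up test data: 4096 non-repeating bytes, then the same data with a
# 6-byte prefix, which shifts every block boundary.
old = bytes((i * 7 + i // 13) % 251 for i in range(4096))
new = b"HEADER" + old

print(fixed_offset_matches(old, new, 64))  # 0
print(rolling_matches(old, new, 64))       # 64
```

With the 6-byte prefix, the fixed-offset comparison finds none of the
64 blocks, while the rolling search finds all of them, so only the
prefix would need to be sent.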
Andrew Tridgell's research paper is available at
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.123.1530&rep=rep1&type=pdf
rsync is covered from section 3 onwards.
--
Simon Hobson