Re: rsync algorithm improvements

2000-12-28 Thread John Langford
> At the risk of boring other readers, I'm curious what numbers you were
> getting during your test - I just tried a --stats myself on a 51MB

rsync version = 2.4.4. Working with a 100MB file and doing a null sync, I see:

    rsync --stats -e ssh -a --block-size=64000

says: wrote 9866 bytes, read 6659 bytes

Re: rsync algorithm improvements

2000-12-28 Thread John Langford
> (no compression at all) you'd have to transmit 6-6.6MB of data - how
> do you arrive at 20MB?

I ran rsync --stats on two identical files of size 100MB with a 64KB block
size and extrapolated to 20GB. The files themselves are incompressible.

> That's sort of what I was getting at.. for example,
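
As a quick check on the 6-6.6MB figure quoted above: one signature per 64KB block of a 20GB file accounts for it, if each block's signature is around 20 bytes. That per-block size (4-byte rolling checksum plus 16-byte strong checksum) is an assumption for illustration, not a number stated in the thread.

# Per-block signature traffic for a null transfer of a 20GB file, 64KB blocks.
SIG = 20               # assumed bytes per block: 4-byte rolling + 16-byte strong
BLOCK = 64 * 1024
print(20 * 10**9 // BLOCK * SIG / 1e6)   # ~6.1 MB with decimal gigabytes
print(20 * 2**30 // BLOCK * SIG / 1e6)   # ~6.55 MB with binary gigabytes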

RE: rsync algorithm improvements

2000-12-28 Thread John Langford
Ok, I think I figured out how to combine the algorithms. Start with a base (like 8 or 256) and a minimum block size (like 256 bytes). The improved rsync will use the old rsync algorithm as a subroutine. I'll call the old rsync algorithm orsync and the new one irsync. Instead of actually transm
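
The message is cut off above, so the following is only a guess at how the pieces named so far (the base, the minimum block size, and orsync as a subroutine) might fit together. The checksum choice, the byte accounting, and the single-process simulation are assumptions for the sketch, not Langford's actual construction.

import hashlib

BASE = 8           # branching factor ("a base (like 8 or 256)")
MIN_BLOCK = 256    # minimum block size ("like 256 bytes")

def strong_checksum(data: bytes) -> bytes:
    # MD5 is only a stand-in for whatever strong checksum is actually used.
    return hashlib.md5(data).digest()

def orsync(old: bytes, new: bytes) -> int:
    # Placeholder for the original rsync algorithm run on one small region;
    # returns a simulated count of bytes needed to update it.
    return 0 if old == new else len(new)

def irsync(old: bytes, new: bytes) -> int:
    # One strong checksum for this region crosses the wire in any case.
    cost = len(strong_checksum(new))
    if strong_checksum(old) == strong_checksum(new):
        return cost                        # identical region: done
    if len(new) <= MIN_BLOCK:
        return cost + orsync(old, new)     # small region: fall back to orsync
    # Otherwise split into BASE pieces and recurse on each pair of pieces
    # (equal-length files assumed, so index-aligned splitting is enough here).
    step = max(MIN_BLOCK, -(-len(new) // BASE))
    return cost + sum(irsync(old[i:i + step], new[i:i + step])
                      for i in range(0, len(new), step))

With BASE = 2 and the orsync fallback removed, this collapses to the plain binary search described in the next message.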

RE: rsync algorithm improvements

2000-12-28 Thread John Langford
> Can you expand on this further? It seems to me that to determine that

The recursive algorithm which does a binary search for changes works like:
1. If the size is smaller than 128 bytes, then send it over,
   else each side does a strong checksum.
   If they are the same, then do nothing
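
A minimal single-process sketch of those steps (the preview cuts off before the recursion, which is filled in here from the "binary search" description; the function name, the checksum choice, and the byte accounting are assumptions):

import hashlib

def binary_sync(old: bytes, new: bytes) -> int:
    # Equal-length files assumed; returns simulated bytes sent over the wire.
    # Step 1: regions smaller than 128 bytes are simply sent over.
    if len(new) < 128:
        return len(new)
    # Each side computes a strong checksum (MD5 is a stand-in choice here).
    sent = len(hashlib.md5(new).digest())
    if hashlib.md5(old).digest() == hashlib.md5(new).digest():
        return sent                      # same: do nothing further
    # Different: binary search, i.e. recurse on each half.
    mid = len(new) // 2
    return (sent + binary_sync(old[:mid], new[:mid])
                 + binary_sync(old[mid:], new[mid:]))

For a single changed byte in a 100MB file this walks about log2(100MB/128) ≈ 20 levels, so the traffic is a few dozen 16-byte checksums plus one region under 128 bytes, rather than a signature for every block.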

rsync algorithm improvements

2000-12-24 Thread John Langford
I recently read through the rsync technical report and discovered that rsync doesn't use the algorithm that I expected. I expected an algorithm which does a binary search for the differences, resulting in a network utilization of O(min(log(F)*C, F)), where F is the filesize and C is the number of changes.
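
To make the bound concrete, here is a rough comparison of the two approaches for a nearly-unchanged file. The concrete numbers (file size, change count, per-block and per-checksum byte costs) are assumptions for illustration, not figures from the thread.

from math import ceil, log2

F = 20 * 2**30            # 20GB file
C = 10                    # a handful of changed regions
BLOCK = 64 * 1024         # 64KB rsync block size
SIG = 20                  # assumed per-block signature bytes (4 weak + 16 strong)
CKSUM = 16                # assumed bytes per strong checksum exchanged
LEAF = 128                # region size below which data is just sent

# Standard rsync: one signature per block no matter how little changed -- O(F).
rsync_bytes = ceil(F / BLOCK) * SIG

# Binary search: each change walks ~log2(F/LEAF) levels, exchanging a couple
# of checksums per level and finally one small region -- O(C * log(F)).
search_bytes = C * (ceil(log2(F / LEAF)) * 2 * CKSUM + LEAF)

print(round(rsync_bytes / 1e6, 2), "MB vs", round(search_bytes / 1e3, 1), "KB")

Once C grows large enough that C*log(F) exceeds F, the search stops paying off and the min() falls back to the linear term, which is what the bound above captures.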