Dear rsync experts,

I'd like to ask you a couple of questions.

***** 1. *****
If I'd like to send additional metadata for each block, what would be the 
easiest and least intrusive way to do this? The following seems to work:
diff -rupN ../rsync.git/token.c ./token.c
--- ../rsync.git/token.c        2015-11-03 18:21:36.264183118 +0100
+++ ./token.c   2015-12-26 03:43:09.043841052 +0100
@@ -226,8 +226,12 @@ static int32 simple_recv_token(int f, ch
        if (residue == 0) {
                int32 i = read_int(f);
-               if (i <= 0)
+               if (i <= 0) {
+                       if (protocol_version >= 32) {
+                               int32 j = read_int(f); /* additional metadata */
+                       }
                        return i;
+               }
                residue = i;
@@ -252,8 +256,11 @@ static void simple_send_token(int f, int
        /* a -2 token means to send data only and no token */
-       if (token != -2)
+       if (token != -2) {
                write_int(f, -(token+1));
+               if (protocol_version >= 32)
+                       write_int(f, -(2*(token+1))); /* additional metadata */
+       }
 /* Flag bytes in compressed stream are encoded as follows: */

Is guarding this with protocol_version enough, or could it be done better?
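
To make the question concrete, here is a minimal, self-contained sketch of what the version gate is meant to guarantee: a peer that negotiated an older version sees exactly the old byte stream, while a newer peer reads one extra int per token. The in-memory `struct stream`, the helpers `put_int32`/`get_int32`, the names `send_match`/`recv_match` and the constant `META_MIN_VERSION` are all hypothetical stand-ins for illustration, not rsync's actual I/O layer, and the sketch only covers match tokens, not literal data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory stream standing in for the real socket I/O. */
struct stream {
    unsigned char buf[64];
    size_t wpos, rpos;
};

/* Append a 32-bit value, least significant byte first (byte order chosen
 * for this sketch). */
static void put_int32(struct stream *s, int32_t v)
{
    uint32_t u;
    memcpy(&u, &v, sizeof u);
    assert(s->wpos + 4 <= sizeof s->buf);
    for (int i = 0; i < 4; i++)
        s->buf[s->wpos++] = (unsigned char)(u >> (8 * i));
}

/* Read back a 32-bit value written by put_int32(). */
static int32_t get_int32(struct stream *s)
{
    uint32_t u = 0;
    int32_t v;
    assert(s->rpos + 4 <= s->wpos);
    for (int i = 0; i < 4; i++)
        u |= (uint32_t)s->buf[s->rpos++] << (8 * i);
    memcpy(&v, &u, sizeof v);
    return v;
}

/* Assumption: first protocol version that carries the extra field. */
#define META_MIN_VERSION 32

/* Sender: the token goes out as -(token+1), as in the patch above; the
 * metadata follows only if the negotiated version understands it. */
static void send_match(struct stream *s, int version, int32_t token, int32_t meta)
{
    put_int32(s, -(token + 1));
    if (version >= META_MIN_VERSION)
        put_int32(s, meta);
}

/* Receiver: returns the decoded token; *meta is 0 for older versions. */
static int32_t recv_match(struct stream *s, int version, int32_t *meta)
{
    int32_t i = get_int32(s);
    *meta = 0;
    if (i <= 0 && version >= META_MIN_VERSION)
        *meta = get_int32(s);
    return -(i + 1);
}
```

The point of the sketch is that sender and receiver must test the same negotiated version at the same place in the stream; if only one side is gated, the stream desynchronizes.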

***** 2. *****
Sending blocks sequentially is well suited to cases where:
1) nothing in the old target file is at the same place as in the new source, and
2) no run of consecutive matches longer than 1024 bytes can be found.
But what if many blocks are at the same place and there are matches spanning 
dozens of consecutive blocks? Then this approach is no longer efficient. How 
about (at least theoretically) reworking only this part of the protocol 
(sending/receiving literal/token data) along the following lines?

For each condition marked (*) below: when it occurs, accumulate as many 
consecutive blocks as possible, then send the fields 1), 2), ... in order.

 * no matched data
  1) type of transfer (e.g., 1)
  2) offset to start writing to
  3) data length
  4) literal data
 * matched data
  1) type of transfer (e.g., 2)
  2) offset to start writing to
  3) offset to start reading from
  4) data length
 * data at the same offset
  1) type of transfer (e.g., 3)
  2) start offset
  3) data length
 * no data at all (zeros/holes)
  1) type of transfer (e.g., 4)
  2) start offset
  3) data length
The type of transfer would be a single byte. The offset and data length fields 
would be 64 bits long. Alternatively, the versioning could be split in two, 
with a separate version number for this part of the protocol sent ahead of the 
type of transfer.
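
The record layout proposed above can be sketched as follows, assuming a one-byte type followed by 64-bit fields in the listed order. All names (`xfer_type`, `xfer_hdr`, `xfer_encode`, `xfer_decode`) and the byte order are hypothetical choices for illustration; for type 1 the literal data would follow the encoded header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the four transfer types listed above. */
enum xfer_type {
    XFER_LITERAL = 1,  /* no matched data; literal bytes follow the header */
    XFER_MATCH   = 2,  /* copy from another offset in the old file */
    XFER_SAME    = 3,  /* data already at the same offset */
    XFER_HOLE    = 4   /* zeros / hole */
};

struct xfer_hdr {
    uint8_t  type;
    uint64_t dst_off;  /* offset to start writing to */
    uint64_t src_off;  /* offset to start reading from (XFER_MATCH only) */
    uint64_t len;      /* data length */
};

/* Largest header: 1 type byte + three 64-bit fields. */
#define XFER_HDR_MAX 25

static size_t put_u64(unsigned char *p, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        p[i] = (unsigned char)(v >> (8 * i));
    return 8;
}

static size_t get_u64(const unsigned char *p, uint64_t *v)
{
    *v = 0;
    for (int i = 0; i < 8; i++)
        *v |= (uint64_t)p[i] << (8 * i);
    return 8;
}

/* Encode h into out (at least XFER_HDR_MAX bytes); returns bytes written. */
static size_t xfer_encode(const struct xfer_hdr *h, unsigned char *out)
{
    size_t n = 0;
    assert(h->type >= XFER_LITERAL && h->type <= XFER_HOLE);
    out[n++] = h->type;
    n += put_u64(out + n, h->dst_off);
    if (h->type == XFER_MATCH)
        n += put_u64(out + n, h->src_off);
    n += put_u64(out + n, h->len);
    return n;
}

/* Decode a header from in; returns bytes consumed. */
static size_t xfer_decode(const unsigned char *in, struct xfer_hdr *h)
{
    size_t n = 0;
    h->type = in[n++];
    n += get_u64(in + n, &h->dst_off);
    h->src_off = 0;
    if (h->type == XFER_MATCH)
        n += get_u64(in + n, &h->src_off);
    n += get_u64(in + n, &h->len);
    return n;
}
```

With this layout a run of 64 consecutive matched blocks costs one 25-byte header instead of 64 four-byte tokens (256 bytes), whereas a single isolated match costs 25 bytes instead of 4, so the gain depends on how long the runs typically are.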

What do you think, theoretically and maybe practically? Thanks to protocol 
versioning, it should be possible to change the protocol in an arbitrary 
fashion, since the old code for previous versions remains.

***** 3. *****
Testing such changes for regressions means covering some tricky corner cases, 
e.g. the one described in "Extra writes with --inplace due to misaligned 
block matching".
Are there any tests for this? If not, how was it debugged to ensure the issue 
is fixed?

***** 4. *****
Am I right that, because the offset is sent as an int32, rsync is limited to 
synchronizing files of at most 2 TB (using the default block size of 1024 
bytes)? That sounds huge, but nowadays it no longer is. This is one more 
argument for a protocol update.
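
The arithmetic behind the 2 TB figure can be checked with a short sketch. It assumes the limit comes from the block index having to fit in a non-negative int32 (2^31 distinct values); the function name `max_addressable_bytes` is hypothetical.

```c
#include <stdint.h>

/* Upper bound on the file size reachable with non-negative int32 block
 * indexes (0 .. 2^31-1) at a given block size in bytes. */
static uint64_t max_addressable_bytes(uint32_t block_size)
{
    uint64_t max_blocks = (uint64_t)INT32_MAX + 1;  /* 2^31 indexes */
    return max_blocks * block_size;
}
```

For a block size of 1024 this gives 2^31 * 2^10 = 2^41 bytes, i.e. 2 TiB; under the same assumption a larger block size raises the bound proportionally.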

***** 5. *****
Traffic on the rsync mailing list has dropped almost to zero. There are 
hundreds of open bugs that have received no attention, some important features 
are missing, and there is almost no development on this classic tool. Or is it 
legacy by now? Maybe a successor to rsync already exists and is being actively 
developed and maintained, and I'm just not aware of it? (That would make 
attempts to tweak rsync pointless.)

Thanks in advance,
