Jason, Wayne,
  As far as I know, rsync transfers the files themselves using TCP, right?
  So rsync can be affected by latency - when transferring a large file,
  it is subject to TCP's latency effects, including slow recovery from
  a loss on large bandwidth*delay product (BDP) links.
  (Same with several small files - they all go through one TCP session).

I'll touch on:
1. TCP buffers
2. Intrinsic path loss rate, and help-from-parallelism (more under "4" also)
3. SSH buffers
4. Alternate stacks (changes TCP's growth and loss-recovery algorithms)

1. TCP buffers

I saw this when debugging file transfers from FermiNatlLab to Renater (France).
  If you have sufficient TCP buffers on both end-hosts (bw*delay),
  **and** have sufficiently low loss, TCP can fill the pipe.
  Jason's assertion that multiple rsyncs in parallel can fill the pipe
  makes me lean towards one of the above as his core issue.
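  (For anyone following along, the bw*delay sizing is just arithmetic; the
  bandwidth and RTT below are made-up illustrative numbers, not Jason's link):

     # Illustrative only: 1 Gbit/s path with a 100 ms round-trip time
     BW_BITS=1000000000    # link bandwidth, bits per second
     RTT_SEC=0.100         # round-trip time, seconds
     # BDP in bytes = (bits/s * seconds) / 8
     echo "BDP: $(echo "$BW_BITS * $RTT_SEC / 8" | bc) bytes"
     # -> 12500000 bytes (~12.5 MB) of TCP buffer needed, on both ends,
     #    for a single flow to keep that pipe full.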

Wayne - you helped address the small-default-TCP-buffer issue by adding client-side
  buffer-size requests (thanks!).

  **If** Jason's OS does not autotune its buffers, then you have provided
  the hook to allow him to set "large enough" buffers via the CLI.
  (But if he's running current Linux, and has large enough max-TCP-buffers,
  then Linux should auto-tune up to the required amount w/o CLI intervention).
  I think Jason has experimented with this before; but for other folks,
  just remember both the sender's TX and the receiver's RX buffers need to be
  "large enough".

2. Intrinsic path loss rate, and help-from-parallelism

  If you have enough loss to keep knocking TCP's rate down, then the
  usual fixes are to reduce the loss rate, use multiple parallel TCP flows,
  or use a "more aggressive" TCP stack (more below).

3. SSH buffers

  There is one other subtle possibility, which I also saw first hand.
  That is, if you send the files via ssh encryption...
  It turns out that SSH implements its own windowing mechanism,
  ** on top of TCP windowing **.  It is fixed-size; you can't upsize it to account
  for a large BDP.  That means that even if your TCP buffers are large enough,
  you still won't get a full pipe, because of SSH's channel windowing.
  In the absence of a "fixed" ssh, parallelism can also work around this issue
  (10 ssh/tcp flows can each work with 1/10 the full BDP buffer size, so 10 copies
  of the fixed-size ssh buffers might work out OK).
  Chris Rapier has been working on a fix for this, see:
   http://www.psc.edu/networking/projects/hpn-ssh/
  Note this affects anything using ssh, like scp, sftp, rsync -e ssh, etc.
  And sorry for the redundancy - I'd mentioned this in my post back in Nov'05.
  Installing this patch would "fix" ssh windowing for any app that uses
  it, including rsync (though I've not used this fix myself).
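  To see why a fixed window hurts, remember that a window-limited flow tops
  out at roughly window/RTT, no matter how big the TCP buffers underneath are.
  The window size here is purely illustrative, not ssh's actual default:

     WINDOW_BYTES=1048576   # pretend the fixed ssh channel window is 1 MB
     RTT_SEC=0.100          # 100 ms round trip
     echo "max ~$(echo "$WINDOW_BYTES / $RTT_SEC / 1000000" | bc) MB/s"
     # -> ~10 MB/s (~80 Mbit/s), even on a path that could carry 1 Gbit/s.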


4. Alternate stacks (changes TCP's growth and loss-recovery algorithms)

  Jason - I suspect you've upsized the buffers already
  (we had an ancient conversation about this...).
  If you're sending clear-text then it's not ssh windowing.
  (But you were using a VPN at one time, and I'm still
  uncertain about possible windowing-like buffer artifacts from the VPN).

  Since parallelism fills your pipe, that leaves baseline loss
  as a likely candidate.
  Supercomputer centers often use parallelism to help
  get around TCP's loss/recovery behavior.
  It's kind of a hack (my opinion), but it works (cf. GridFTP or bbftp).
  It spreads the loss over several TCP flows, so only one flow backs off
  by 50%, rather than the whole aggregate backing off by 50%.
  That's straightforward to confirm with iperf, and I think Jason
  has done that in the past.
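  For anyone who hasn't run that comparison, it's roughly this (host name is
  a placeholder; flag spellings may differ slightly across iperf versions):

     # On the receiving host:
     iperf -s

     # On the sending host: one flow, then ten parallel flows:
     iperf -c remotehost -t 30
     iperf -c remotehost -t 30 -P 10
     # If the 10-flow aggregate is much higher than the single flow,
     # per-flow loss recovery (or a window limit) is the likely bottleneck.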

  If you feel that's likely, and you "own" the link
  (i.e. your colleagues won't get mad at you if you dominate the link),
  I'd suggest trying one of the alternate TCP congestion control
  stacks, many of which are built into current Linux.
  Like BIC, or HS-TCP (or H-TCP, if it's there).
  These try to optimize for high bw*delay links, usually by
  climbing the rate-curve faster than Reno, and usually
  recovering faster after a loss.  So if your core issue is non-congestive
  loss, the alternate stacks might improve things significantly.
  It's not the same approach as parallelism, but can give similar results.
  And you can try it (if on Linux) w/o Wayne having to build in parallelism support.
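  On a reasonably current Linux kernel, trying an alternate stack is just a
  sysctl away (which names appear depends on what your kernel was built with):

     # List the congestion-control algorithms this kernel offers:
     cat /proc/sys/net/ipv4/tcp_available_congestion_control

     # Switch the system default (as root), e.g. to BIC or H-TCP:
     sysctl -w net.ipv4.tcp_congestion_control=bic
     # or:
     sysctl -w net.ipv4.tcp_congestion_control=htcp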

Aside: if you use Microsoft Windows, note that Vista has TCP buffer autotuning,
  and an experimental Microsoft Research stack called Compound TCP (CTCP).
  You could get on the Vista beta... ;-)

  Let us know what you learn - I suspect there are an increasing number
  of folks using rsync in a large bw*delay environment, and your
  experience will help them.

Best regards,
Larry
--



At 2:12 PM -0800 3/5/06, Wayne Davison wrote:
On Mon, Mar 06, 2006 at 09:22:04AM +1300, Jason Haar wrote:
 We have fat pipes and yet a single rsync session cannot saturate them
 due to the latency.

Rsync is not adversely affected by high latency connections because the
three processes (generator, sender, and receiver) all run concurrently
without waiting for any round-trip confirmations (until the end of the
entire file list).  If your link isn't being saturated, it should be due
to some other bottleneck, such as disk I/O, CPU, or just not being able
to rapidly find the files that need to be updated.

If disk I/O is the limiting factor, try using --whole-file (i.e. as
your transfer speed approaches/exceeds your disk I/O speed, rsync's
incremental update algorithm can take longer to find the differences
than it would take to just send the updated file).

It also helps to avoid things that load the CPU, such as the --compress
(-z) option, and slow ssh encryption.  You can try switching ssh to
blowfish encryption, or perhaps switching from ssh to rsh if CPU is your
limiting factor.

..wayne..