On 28 Feb 2001, Martin Pool <[EMAIL PROTECTED]> wrote:

> > What I don't see is how we could recode this to avoid the zero window
> > without losing a lot of the pipelining advantage we have now. Going to
> > a more traditional request/response model in rsync would certainly
> > make TCP like us but would incur a huge penalty in latency.
> > 
> > As we are currently working on the design of rsync 3.0 it would be
> > good to get suggestions now on brilliant ways of solving this
> > problem. (please don't suggest opening a second control socket!)

Here's my current idea: it may be naïve about the TCP bugs, but I hope
it will be more reliable while still keeping both directions of the
connection as full as possible.

I don't think we really need multiplexed streams: every error message
relates back to a particular operation that has failed, although at
the moment the code doesn't structure them that way.

So, I'd like to change to a request-response protocol like POP3,
HTTP/1.1 or SMTP, where the client sends commands and gets responses
which contain a status code, possibly body data, and possibly an
error message.  Perhaps we can allow the body data to be sent in
several chunks, so that responses that fail halfway through (e.g. a
truncated file or an IO error) can be cut short cleanly.  If there's
an error on the server, it should result in an error response.  I'm
imagining a fairly concise encoding format, perhaps ASCII, with
commands like

 * authentication conversations modelled on PAM

 * choose rsync module

 * mkdir

 * set permissions/acl

 * get rsync signature

 * apply patch

 * list directory

 * make link

 * turn on gzip
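To make the idea concrete, here is a minimal sketch of what such a
concise ASCII framing might look like. Everything here is hypothetical:
the command names, status codes, and chunked-body format are
illustrations, not a proposal for the actual wire format.

```python
# Hypothetical request/response framing for rsync 3.0, POP3/SMTP style.
# Command names, status codes and chunk format are illustrative only.

def encode_request(command, *args):
    """Frame a request as a single ASCII line."""
    return " ".join([command, *args]).encode("ascii") + b"\r\n"

def encode_response(status, body=b"", message=""):
    """Frame a response: status line, optional chunked body, terminator.

    Sending the body in length-prefixed chunks lets the server cut a
    response short (e.g. on an IO error) and still report the failure
    in a final status chunk.
    """
    out = [f"{status} {message}".strip().encode("ascii") + b"\r\n"]
    if body:
        out.append(b"%d\r\n" % len(body) + body + b"\r\n")
    out.append(b"0\r\n")  # zero-length chunk marks end of body
    return b"".join(out)

# Example exchange for a hypothetical "list directory" command:
req = encode_request("LIST", "src/")
resp = encode_response(200, b"main.c\nutil.c\n", "OK")
```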

At first in rsync 3, we can just do this in blocking mode, like most
HTTP or POP implementations: the client sends a command, then does
blocking reads to get the response.
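The blocking mode is simple enough to sketch in a few lines. This is a
toy illustration, not rsync code; the single-line reply format is an
assumption carried over from the framing sketch above.

```python
import socket

# Sketch of the blocking mode: write one command, then do blocking
# reads until a full (hypothetical) status line has arrived.

def blocking_request(sock, command):
    sock.sendall(command.encode("ascii") + b"\r\n")
    buf = b""
    while not buf.endswith(b"\r\n"):  # block until the status line is complete
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("server closed connection")
        buf += chunk
    return buf.rstrip(b"\r\n").decode("ascii")
```

A socketpair stands in for the server in testing: pre-load a reply,
issue a command, and check both directions.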

I think we could then have the client start pipelining requests, using
select() on the socket and a work queue to keep track of outstanding
requests.  In some cases where there are dependencies between
requests, such as creating a directory and then the files inside it,
we'll have to compose the protocol so that all the requests can be
issued before the first one completes.  Also, for example, we might
have to keep issuing requests for signatures until the first one
arrives back, and then start sending its delta.  The server, of
course, doesn't have to be nonblocking; it can just handle requests as
they arrive.

So, I think this will keep the send and receive sides full as much of
the time as possible, while still allowing TCP to keep the two sides
in loose synchronization.

To me, this seems like a fairly conservative use of sockets, and so I
hope it won't encounter more of these kinds of bugs.  Is it going to
be a problem (with or without SSH) to have e.g. the ->server side of
the socket full while the server is trying to write, or vice versa, if
the client doesn't fork?

-- 
Martin Pool, Human Resource
Linuxcare, Inc.   +61 2 6262 8990
[EMAIL PROTECTED], http://linuxcare.com.au/
Linuxcare.  Putting Open Source to work.
