On Thu, May 31, 2001 at 06:50:01PM -0700, Wayne Davison wrote:
> I've been doing some testing where I trigger the rsync hang I talked
> about in my previous email (where the redo pipe to the generator process
> fills up and causes the receiver to deadlock). This bug is easy to
> trigger on a local-to-local rsync copy if I change receiver.c to retry
> every file during the first phase. My previous patch made it much
> harder to cause a deadlock, but was not totally effective -- it was
> still possible for an io_flush() in the generator (which happens prior
> to the read of the f_recv pipe) to block, allowing the receiver to
> deadlock trying to write to the generator.
>
> In order to fix this it looks like we need to make the generator keep
> the f_recv pipe empty while it is trying to write to the sender. The
> appended patch accomplishes this by changing io.c to allow an input fd
> to be registered and then monitored whenever we're writing data (through
> a simple extension of the existing select() call). The data is simply
> buffered up and used when generate_files() reads the ints (I changed
> readfd(), which is called by read_int()).
>
> With the appended patch I have not been able to get rsync to hang
> anymore. This patch is relative to the current CVS. You should do a
> "make proto" after applying this patch.
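
For anyone reading along without the patch in hand, here is my rough
understanding of the approach described above, as a minimal sketch and not
the actual change to io.c. Every name in it (watch_input_fd, write_all,
buffered_read, buffered_pending, set_nonblocking) and the fixed 64 KB side
buffer are assumptions made up for illustration, not identifiers from rsync.
It also assumes the output fd has been put into non-blocking mode.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical names throughout; this is not rsync's io.c. */
#define SIDE_BUF_SIZE 65536

static int side_fd = -1;            /* input fd kept drained while we write */
static int side_eof;                /* set once side_fd has returned EOF */
static char side_buf[SIDE_BUF_SIZE];
static size_t side_off, side_len;   /* unread bytes start at side_off */

int set_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL);
    return fl < 0 ? -1 : fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/* Register the input fd to monitor during writes. */
void watch_input_fd(int fd)
{
    side_fd = fd;
    side_eof = 0;
    side_off = side_len = 0;
}

size_t buffered_pending(void) { return side_len; }

/* Called only after select() reported side_fd readable and the buffer has room. */
static void drain_side_fd(void)
{
    assert(side_len < sizeof side_buf);
    if (side_off > 0) {             /* compact so free space is contiguous */
        memmove(side_buf, side_buf + side_off, side_len);
        side_off = 0;
    }
    ssize_t n = read(side_fd, side_buf + side_len, sizeof side_buf - side_len);
    if (n > 0)
        side_len += (size_t)n;
    else if (n == 0)
        side_eof = 1;
}

/* Write everything to out_fd (assumed non-blocking), buffering any data that
 * arrives on the watched input fd in the meantime. Returns 0 or -1. */
int write_all(int out_fd, const char *data, size_t len)
{
    while (len > 0) {
        fd_set rfds, wfds;
        int maxfd = out_fd;
        /* Stop watching when the fixed buffer is full; real code might grow it. */
        int watching = side_fd >= 0 && !side_eof && side_len < sizeof side_buf;

        FD_ZERO(&rfds);
        FD_ZERO(&wfds);
        FD_SET(out_fd, &wfds);
        if (watching) {
            FD_SET(side_fd, &rfds);
            if (side_fd > maxfd)
                maxfd = side_fd;
        }
        if (select(maxfd + 1, &rfds, &wfds, NULL, NULL) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (watching && FD_ISSET(side_fd, &rfds))
            drain_side_fd();
        if (FD_ISSET(out_fd, &wfds)) {
            ssize_t n = write(out_fd, data, len);
            if (n < 0) {
                if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
                    continue;
                return -1;
            }
            data += n;
            len -= (size_t)n;
        }
    }
    return 0;
}

/* Read from fd, serving previously buffered bytes first for the watched fd. */
ssize_t buffered_read(int fd, char *dst, size_t len)
{
    if (fd == side_fd && side_len > 0) {
        size_t n = len < side_len ? len : side_len;
        memcpy(dst, side_buf + side_off, n);
        side_off += n;
        side_len -= n;
        return (ssize_t)n;
    }
    if (fd == side_fd && side_eof)
        return 0;
    return read(fd, dst, len);
}
```

In this sketch the writer never sits in a blocking write while the other
process is stuck trying to send it data: anything that shows up on the
watched fd is pulled into the side buffer, and a later buffered_read() on
that fd hands it back in order before touching the pipe again.
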
A couple of questions about your patch:

This replaces your previous patch rather than adding to it, right?
I'm pretty sure it's a replacement, but I just want to be sure.

Were you testing with -W on your local-to-local copy? If you were testing
against the current rsync in CVS then it would have been the default. I
don't know under what conditions rsync will normally do the "retry" you
refer to, or whether it ever happens with -W.

I hope Tridge will take a look at this one because he's the one who knows
the most about that area of the code.

- Dave Dykstra