Re: TS-857 and finer grained locking

John Plevyak Wed, 14 Mar 2012 19:30:13 -0700

On Wed, Mar 14, 2012 at 8:22 AM, Alan M. Carroll <
a...@network-geographics.com> wrote:

>
> > There is, however, one situation where this simple and safe order of
> events
> > is not followed.  That is connection sharing to origin servers.  Here the
> > situation starts the same, but when the client is done with the
> connection
> > it does not issue a do_io_close(), and this is where the problems can
> begin.
>
> That's not my interpretation of the crashes. We tried various settings for
> connection sharing to no observed effect on crashing type or frequency. In
> fact all of the configurations I use for testing have connection sharing
> disabled.
>

"As far as I can tell the problem arises when the VCs in a
HttpServerSession are split across two threads"

This can only occur when there is some connection sharing or if someone has
introduced a thread switch in some other processor which triggers the OS
connection.  AFAIK the OS connection is initiated on the thread which has
the client connection and thus, without connection sharing, they should be
on the same thread.

>
> One might ask, why is HttpServerSession split across threads like that? I
> have no idea. But it seems to happen much more with forward proxy (note: I
> have only indirect evidence for that).
>

A question I have as well.  This should not be the case and is going to
cause performance problems.  That said, it should not result in a crash.

> > John Plevyak writes:
> > So, this patch.  What does it do?  It uses smart pointers to prevent either
> > of the two threads from making one particular change to the shared NetVC
> > that they are currently scribbling all over: that of deleting it while
> the
> > other is still running.  It doesn't prevent any of the other horrors, or
> > all other manner of crashes, race conditions and unexpected behavior,
> just
> > the one, deallocating.  It is a serious one, but not the only one.
>
> My current view is that this is the only problem, because in all other
> cases the locking is working.
>

My view is that this is only one of many failure modes, albeit the most
common one.  If the locking were working, the client would clear all
pointers to the netvc and then call close() while holding the last pointer
in local storage.  The crashes you are seeing would then be impossible: the
netvc would be freed by the owning thread and all pointers would have
already been cleared.  The only way there can be a crash is if two threads
are holding the pointer in volatile memory, and the only way that can
happen is if the pointers are not cleared or if the locks are not
working.
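
To make the intended order concrete, here is a minimal C++ sketch of
"clear the shared pointers, then close through a local copy".  NetVC,
Session, release_vc() and the placement of the mutex are all hypothetical
stand-ins for illustration, not the actual Traffic Server classes.  It
assumes that every reader of the shared pointer takes the same lock.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a net VC; not the real Traffic Server type.
struct NetVC {
    bool *freed_flag;  // lets a caller observe that close() ran
    explicit NetVC(bool *flag) : freed_flag(flag) {}

    // close() frees the object; it must only run on the owning thread.
    void close() {
        *freed_flag = true;
        delete this;
    }
};

// Hypothetical session holding the one shared pointer to the VC.
struct Session {
    std::mutex lock;             // assumed to guard server_vc
    NetVC *server_vc = nullptr;  // shared field other code may read

    // Returns true if this call closed the VC, false if it was already gone.
    bool release_vc() {
        NetVC *vc = nullptr;
        {
            std::lock_guard<std::mutex> guard(lock);
            vc = server_vc;       // last pointer, held in local storage
            server_vc = nullptr;  // clear the shared pointer first
        }
        if (vc == nullptr) {
            return false;         // someone else already released it
        }
        vc->close();              // no shared pointer can reach it now
        return true;
    }
};
```

With this order, a second call to release_vc() finds a null pointer and
does nothing, so the free happens exactly once.  The sketch only holds up
if the lock really is taken on every path, which is the point above.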

john
