Re: TS-857, TS-934, TS-1031

Alan M. Carroll Mon, 12 Dec 2011 07:22:17 -0800

Friday, December 9, 2011, 5:17:56 PM, you wrote:

> NetHandler::mainNetEvent doesn't lock the VCs because it doesn't need to
> until it wants to access the internal data at which time it locks it.
>  Therefore the internal data is protected by the NetVC mutex.  However, the
> existance is NOT protected by that mutex.


That's not what I have seen in debugging runs. i need to double check the 
actual call stack but I see calls down through read and in to close that don't 
lock the VC.

> The NetHandler lock is used to protect the allocation, deletion of the
> NetVC.  The owner of the NetVC lock can set the 'closed' flag, but can't
> delete the NetVC unless they own the NetHandler lock.

Sure it can, it just calls do_io_close. I direct your attention to 
HttpServerSession.cc:do_io_close which does precisely this. I have attached a 
diagram of this call chain, derived directly from an actual stack trace.
 
> A well functioning
> protocol engine would hold the NetVC mutex,. remove all references to the
> NetVC  then call do_io_close().

But isn't it the NetHandler mutex that controls de-allocation? Holding just the 
NetVC mutex and calling do_io_close doesn't follow that protocol.

> It looks like there are crashes because someone is calling do_io_close
> without ensuring that all the holders of pointers to the NetVC have been
> cleared.  Once do_io_close() is called, the NetVC is dead, it can't be
> accessed any more because the NetHandler can access the "closed" flag
> WITHOUT the NetVC lock, but WITH the NetHandler lock and delete the NetVC.

But how is that to be done? In the case I attached, the NetVC is possibly 
present in a ready queue of a NetHandler which is different from the NH that is 
invoking do_io_close. How can it find the correct NH and remove the NetVC from 
those queues? One must also be concerned about deadlock, where two threads try 
to close the NetVC, each holding the NetHandler mutex the other thread wants.

> There are a variety of ways of throwing smart pointers at the "problem"
> which basically confuse when the object is actually live and let people
> continue to hold pointers to dead/closed objects in buggy code.

No, it's using them to help distinguish between "live" and "de-allocated". It 
seems to me to be the least costly / complex way for thread 1 to in effect cull 
entries from thread 2's internal queues. With smart pointers thread 1 doesn't 
have to know which thread is thread 2, or even if it exists. If you have an 
alternate coordination mechanism, please suggest it.

> This doesn't change the fact that if they made sure not to hold stale
> pointers, there wouldn't be a problem in the first place.
 
Yes, but the question is, how can a thread make sure to not hold a stale 
pointer to an object which can be closed / de-allocated in another thread? 
Smart pointers make that simple without deadlock and with minimal lock 
contention.

Re: TS-857, TS-934, TS-1031

Reply via email to