Re: TS-857, TS-934, TS-1031

John Plevyak Wed, 14 Dec 2011 10:18:53 -0800

It would be a serious undertaking, but it is clearly necessary, and I think
it can be done incrementally: we can move to the new model within the
Processors first and then transition the APIs.  Based on that, in the short
term we can take a more heavy-handed approach to solving this problem and
just wrap some big locks around the problem areas, then push them down (a la
multi-threading the FreeBSD kernel).  We will have some contention problems
for a while, but that is better than crashing.


john

On Wed, Dec 14, 2011 at 9:52 AM, Leif Hedstrom <zw...@apache.org> wrote:

> On 12/14/11 10:28 AM, John Plevyak wrote:
>
>> The old locking system was based on TryLocks which could not be taken
>> forcibly.  Moreover, it depended on very subtle knowledge of which bits of
>> the various data structures were protected by which locks.  This is
>> clearly not sustainable, nor is it necessary any longer.  Modern threading
>> systems work well with larger numbers of threads and fine-grained locking.
>>
>> So, let's change to the more conventional model with fine-grained locks
>> which protect data structures that are clearly labeled with the lock that
>> protects them and have external APIs which enforce that protection.  These
>> locks will just be taken in the standard manner, and we will have to
>> ensure that the data structures are sliced so as to minimize lock
>> contention.
>>
>> Let's also have a "Transaction" object (essentially our current Mutex with
>> additional tracking and bookkeeping) and an explicit mechanism for
>> associating resources owned by Processors (e.g. NetVC) with a Transaction,
>> for passing resources (e.g. a NetVC) from one Transaction to another, and
>> for returning the resource to the Processor when it is no longer required
>> (close/free/release).  We can also use proxy smart pointers and
>> encapsulation in debug mode to test that the ownership rules are being
>> obeyed correctly and that "stale" pointers are not being accessed (i.e.
>> after the resource has been released).
>>
>
> This sounds like a great idea! How do we get there? This would be a pretty
> significant undertaking, and touches pretty much all code. So, perhaps this
> is a target goal for v4.0? It sounds like we might need to try to have a
> "hackathon" event again somewhere? :)
>
>
>> I believe that the problems we are currently seeing turn on an even more
>> subtle issue, when the ownership of a NetVC is passed from one transaction
>> to another via the session manager.  Getting that code to work stably
>> required many a careful negotiation and resulted in something which is
>> clearly very brittle and not maintainable.
>>
>
> Hopefully we can figure it out to at least make it not crash in the short
> term? I could quite possibly be at fault here too; I added extra complexity
> by adding an (optional) per-thread session pool. I hope that is not causing
> some of these problems?
>
> Cheers,
>
> -- leif
>
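
To make the proposed model concrete, here is a minimal sketch of a
"Transaction" that owns a resource, guards it with its own lock, only allows
access through an API that takes that lock, and rejects "stale" access after
the resource has been handed off.  Everything in it is an assumption made for
illustration: Resource stands in for something like a NetVC, and the names
adopt, with_resource, release, owns_resource and transfer are hypothetical,
not existing Traffic Server APIs.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for a Processor-owned resource such as a NetVC.
struct Resource {
  int id = 0;
  int bytes_read = 0;
};

// Hypothetical Transaction: a lock plus bookkeeping about what it owns.
class Transaction {
public:
  // Associate a resource handed out by a Processor with this Transaction.
  void adopt(Resource *r) {
    std::lock_guard<std::mutex> guard(lock_);
    if (res_ != nullptr)
      throw std::logic_error("transaction already owns a resource");
    res_ = r;
  }

  // The only way to touch the resource: fn runs with the lock held.
  // Access after release/transfer is reported instead of silently allowed.
  template <typename Fn>
  auto with_resource(Fn &&fn) {
    std::lock_guard<std::mutex> guard(lock_);
    if (res_ == nullptr)
      throw std::logic_error("stale access: resource is not owned");
    return fn(*res_);
  }

  // Give the resource back (e.g. to its Processor); returns nullptr if none.
  Resource *release() {
    std::lock_guard<std::mutex> guard(lock_);
    return std::exchange(res_, nullptr);
  }

  bool owns_resource() {
    std::lock_guard<std::mutex> guard(lock_);
    return res_ != nullptr;
  }

  // Pass ownership from one Transaction to another.  std::lock acquires
  // both mutexes without risking deadlock from inconsistent lock ordering.
  friend void transfer(Transaction &from, Transaction &to) {
    if (&from == &to)
      return; // locking the same mutex twice would be undefined behavior
    std::lock(from.lock_, to.lock_);
    std::lock_guard<std::mutex> g1(from.lock_, std::adopt_lock);
    std::lock_guard<std::mutex> g2(to.lock_, std::adopt_lock);
    if (from.res_ == nullptr)
      throw std::logic_error("nothing to transfer");
    if (to.res_ != nullptr)
      throw std::logic_error("target already owns a resource");
    to.res_ = std::exchange(from.res_, nullptr);
  }

private:
  std::mutex lock_;
  Resource *res_ = nullptr; // protected by lock_
};
```

In use, one Transaction would adopt() the resource, operate on it via
with_resource(), and a handoff such as the one done through the session
manager would become transfer(a, b).  Any later with_resource() call on the
old owner throws rather than touching a pointer it no longer owns.  A real
design would probably compile the stale check out in release builds, as the
"debug mode" remark in the quoted proposal suggests.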
