----- Original Message -----
> I have been thinking about this a lot and I think Alan is right. The
> locking design of ATS is not appropriate for an Apache project. The
> design worked when it was a few full time people sitting within feet
> of each other all day (and chunks of the night) but the world has
> moved on and we should move to a locking and threading model which is
> more robust and fits with the expectations of a larger fraction of
> the community.
>
> I want to apologize for wrapping you all into my long running
> frustration with trying to keep this brittle system stable and open
> up discussion on how we can make it more stable, robust and easier to
> develop in.
>
> It is abundantly clear that if anyone has to go to the lengths that
> Alan has been forced to in order to make this system work under load,
> then it is the system's fault.
>
> So, here is my proposal.
>
> The old locking system was based on TryLocks which could not be taken
> forcibly. Moreover it depended on very subtle knowledge of which bits
> of the various data structures were protected by which locks. This is
> clearly not sustainable, nor is it necessary any longer. Modern
> threading systems work well with larger numbers of threads and
> fine-grained locking.
>
> So, let's change to the more conventional model with fine-grained
> locks which protect data structures which are clearly labeled with
> the lock that protects them and have external APIs which enforce that
> protection. These locks will just be taken in the standard manner,
> and we will have to ensure that the data structures are sliced so as
> to minimize lock contention in the standard manner.
>
> Let's also have a "Transaction" object (essentially our current Mutex
> with additional tracking and bookkeeping) and an explicit mechanism
> for associating resources owned by Processors (e.g. NetVC) with a
> Transaction, for passing resources (e.g. a NetVC) from one
> Transaction to another, and for returning the resource to the
> Processor when it is no longer required (close/free/release). We can
> also use proxy smart pointers and encapsulation in debug mode to test
> that the ownership rules are being obeyed correctly and that "stale"
> pointers are not being accessed (i.e. after the resource has been
> released).
>
> I believe that the problems we are currently seeing turn on an even
> more subtle issue when the ownership of a NetVC is passed from one
> transaction to another via the session manager. Getting that code to
> work stably required many a careful negotiation and resulted in
> something which is clearly very brittle and not maintainable.
>
> I hope that we work through this and end up with a system which is
> substantially more maintainable and easier to develop in.
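
A rough sketch of how the two ideas in the proposal above could fit
together, assuming C++14. It shows data that is only reachable through
the lock that protects it, and a Transaction that tracks which
resources it owns and hands them over explicitly. Every name here
(Guarded, Transaction, Resource, transfer) is hypothetical and is not
existing ATS code; Resource merely stands in for something like a
NetVC. The debug-mode stale pointer checks from the proposal are not
covered.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a Processor-owned resource such as a NetVC.
struct Resource {
  explicit Resource(int i) : id(i) {}
  int id;
};

// Data bundled with the one lock that protects it. The data is private,
// so the only way to reach it is with(), which takes the lock in the
// ordinary blocking manner (no TryLock) for the duration of the call.
template <typename T>
class Guarded {
public:
  template <typename F>
  auto with(F &&f) {
    std::lock_guard<std::mutex> hold(lock_);
    return f(data_);
  }

private:
  std::mutex lock_;
  T data_;
};

// Hypothetical Transaction: records which resources it currently owns.
class Transaction {
public:
  // Take ownership of a resource handed over by a Processor or by
  // another Transaction.
  void adopt(std::unique_ptr<Resource> r) {
    const int id = r->id;
    owned_.with([&](auto &m) { m[id] = std::move(r); });
  }

  // Give up ownership. Returns nullptr if this Transaction does not
  // own a resource with that id.
  std::unique_ptr<Resource> release(int id) {
    return owned_.with([&](auto &m) -> std::unique_ptr<Resource> {
      auto it = m.find(id);
      if (it == m.end()) return nullptr;
      auto r = std::move(it->second);
      m.erase(it);
      return r;
    });
  }

  bool owns(int id) {
    return owned_.with([&](auto &m) { return m.count(id) != 0; });
  }

private:
  Guarded<std::unordered_map<int, std::unique_ptr<Resource>>> owned_;
};

// Pass a resource from one Transaction to another. Only one lock is
// held at any moment, so no lock ordering between the two is needed.
// Returns false if `from` did not own the resource.
bool transfer(Transaction &from, Transaction &to, int id) {
  auto r = from.release(id);
  if (!r) return false;
  to.adopt(std::move(r));
  return true;
}
```

In this sketch, after a.adopt(std::make_unique<Resource>(7)) and
transfer(a, b, 7), a.owns(7) is false and b.owns(7) is true. Because
ownership lives in a std::unique_ptr, the old owner is left with no
pointer that it could call something like do_io_close on.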

+1

> Thanx
> john
>
>
> On Tue, Dec 13, 2011 at 10:30 PM, Alan M. Carroll <
> a...@network-geographics.com> wrote:
>
> > Tuesday, December 13, 2011, 7:00:42 PM, you wrote:
> >
> > >> > No other thread can call vc->do_io_close if they don't have
> > >> > the pointer to it.
> > >> Turns out that at least one other thread does have a pointer.
> > > Then they should not.
> >
> > Great. Now explain it to the compiler. Let me know when you've done
> > that, I'll be going back to work on IPv6.
> >
> > > If you are really having a problem with this I am going to have
> > > to go back through your checkins and see what changes might have
> > > been motivated by such a fundamental lack of understanding of
> > > parallel programming. This is very worrying.
> >
> > I think if you are worried, you should definitely go back and
> > check.
> >
> >
>

-- 
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: i.ga...@brainsware.org
URL: http://brainsware.org/
GPG: 6880 4155 74BD FD7C B515 2EA5 4B1D 9E08 A097 C9AE