On Tue, Dec 13, 2011 at 3:51 PM, Alan M. Carroll < a...@network-geographics.com> wrote:
> Tuesday, December 13, 2011, 3:30:06 PM, you wrote: > > > No other thread can call vc->do_io_close if they don't have the pointer > to > > it. > > Turns out that at least other thread does have a pointer. > Then they should not. > > > It is up to the protocol engine to ensure that all pointers are clear. > > Still not clear on how the protocol engine clears pointers in other > threads. > All the pointers are covered by the same mutex. Take the mutex. Clear the pointers. That is how multi-threaded programming works. > > > That means that you need to ensure that even if there are two different > > classes in two different threads that the pointers to the NetVC are > stored > > under the same instance of Mutex. > > That doesn't help, unless one of the threads holds the mutex the entire > lifetime of the object. If at any time one (non-NetHandler) thread drops > the mutex, then the other (NetHandler) thread can modify the VC to a state > where using the pointer will cause a crash, such as by swizzling the VFPTR, > or setting the nh member to zero (which ironically crashes on the mutex > check in do_io_close), or even re-allocating it so the wrong connection is > closed. Preventing simultaneous access isn't effective if single access > processing can leave the target of the pointer in an undetectably bad state. > The pointer is only swizzled AFTER do_io_close is called. You have to clear all other pointers to the object BEFORE calling do_io_close. 1. Create a Mutex. 2. only access pointers to NetVC under mutex 3. Release Mutex 4. Do what you want but DO NOT call do_io_close 5. Take out Mutex 6. Clear all pointers to NetVC (except the one you are holding) 7. Call do_io_close and clear that pointer. 8. DONE, safely, just as it has worked since 1996. > > > So you clear all pointers except > > the last one, call vc->do_io_closed() then clear the last via vc = NULL > and > > you are done. > > Again, the problem is the implementation mechanism by which those pointers > are cleared. If only there were some technique were you could, say, keep a > count of the number of outstanding pointers and only call do_io_close when > the last one was cleared. > The concept is called "understanding ownership". The NetVC is not some shared thing. It is a thing that should have one single owner. That owner should know where all the pointers to the object are. That owner should not invalidate the object while there are outstanding pointers. I know you have written code in C. This is just like that. You MUST clear all pointers in C before calling free() otherwise when you access through the pointer you will be touching free'd memory. This is the exact same thing. Exactly the same as any program written with pointers. It is no harder. Yes, it is multi-threaded, but it isn't rocket science. You just have to have exclusive access to any shared state. That is how parallel programming works. I don't know how much clearer I can be. If you are really having a problem with this I am going to have to go back through your checkins and see what changes might have been motivated by such a fundamental lack of understanding of parallel programming. This is very worrying. > > > This is multi-threaded programming 101. If you can't handle this then > you > > can't write multi-threaded code. > > I'll just re-assign all my bugs to you and zwoop, then, since I appear to > lack even basic programming knowledge. > You really don't get the concept of exclusive access to shared state and clearing pointers before invalidating an object???? I just can't believe that. We must have some sort of misunderstanding. > > > It sounds like there is a bug in the way that the HTTP sessions are > > handled. I need to look at the stack traces and see if I can figure out > > where the bug is. I you have some way to reproduce the problem that > would > > be great too. > > Set up forward transparent proxy and drive traffic through it. Prior to > 3.1.1, we would get a crash every few hours. bcall and weijin have seen the > same crash. > > > Smart pointers are a way of hiding the bugs, not a solution. > > Some one should let the C++ Standards Committee know. > As I have very clearly stated before, shard_ptr (Ptr) pointers are to be used when there is no clear owner to an object, e.g. Mutex, IOBufferBlock etc. That is when the object is truely "shared". The are not to be used when there is an owner. This is the case in Traffic Server and has been since the beginning. It is also the case at my current employer who as many thousands of programmers working very successfully with this convention. There are very few shared_ptr (Ptr) uses in TS or shared_ptr<> in the vast amount of parallel code have worked with because of this.