The core problem is that VC locking does not work. VCs are
deallocated while still in use, which leads to crashes. In fact, the general
handling of VCs in NetHandler::mainNetEvent doesn't lock the VCs at all, so the
cross-thread access can't be safe. Moreover, because all the VC references are
stored as raw pointers, there's no good way to safely access them across 
threads or even detect deallocation. I tried putting in generation numbers but 
that was insufficient (you can still get crashes because the virtual function 
pointer has been swizzled by the deallocator before you even get to the VC). 
All the crashes in the listed bugs stem from VC locking failures leading to 
cross thread corruption.

I've been at quite a loss on how to fix this.

The two root causes, IMHO, are

1) VCs are ephemeral and stored purely by pointer, which means the accessor
cannot have thread-safe access. Raw pointers are fine only if the objects
don't evaporate asynchronously. The use of raw pointers also means there's no
way to detect dangling references.

2) Closing and deallocation are conflated, so that when a VC needs to be
closed, it is also deallocated. There are good reasons to get the close done
ASAP, but the deallocation is not nearly as time critical.
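
To make the two points above concrete, here is a minimal sketch of what separating close from deallocation could look like with standard smart pointers. Everything in it is hypothetical: `Conn` is a stand-in for a VC rather than the real class, `try_use` is an invented helper, and the descriptor handling is reduced to a placeholder. It only shows that a holder of a `std::weak_ptr` can detect a dead object instead of dereferencing a dangling pointer, and that `close()` can run immediately while the memory is released later.

```cpp
#include <atomic>
#include <cassert>
#include <memory>

// Hypothetical stand-in for a VC; not the real NetVConnection.
struct Conn {
  std::atomic<bool> closed{false};
  int fd = -1;

  explicit Conn(int f) : fd(f) {}

  // Close promptly, but leave the object itself alive.
  void close() {
    if (!closed.exchange(true)) {
      fd = -1; // placeholder: real code would close the socket here
    }
  }
};

// What a non-owning holder (for example a queue entry) could do before
// touching the connection. Returns true only if it is alive and open.
bool try_use(const std::weak_ptr<Conn>& ref) {
  std::shared_ptr<Conn> vc = ref.lock(); // pins the object for this scope
  if (!vc) {
    return false; // already deallocated: detected, not a crash
  }
  return !vc->closed.load();
}

// Walks through the lifecycle: open, closed but allocated, deallocated.
bool demo_close_then_release() {
  auto owner = std::make_shared<Conn>(7);
  std::weak_ptr<Conn> ref = owner;

  bool open_before = try_use(ref);
  owner->close();                           // time-critical part
  bool open_after_close = try_use(ref);     // closed, yet safe to inspect
  bool alive_after_close = !ref.expired();
  owner.reset();                            // deallocation, whenever convenient
  bool alive_after_release = !ref.expired();

  return open_before && !open_after_close &&
         alive_after_close && !alive_after_release;
}
```

Calling `demo_close_then_release()` from any `main` exercises the three states. The sketch does not address the cost of atomic reference counting on the fast path, which would have to be measured.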

I am looking at using smart pointers to alleviate these two issues. However, 
that creates problems because the current queue mechanisms require the use of 
raw pointers. AFAICT the point of these is to minimize allocation on the fast 
path. The circular buffer is an attempt to have as little allocation impact as
possible; otherwise I would just use a standard allocate-per-element style.
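
As a rough illustration of the idea (not the actual patch), a fixed-capacity circular buffer can hold smart pointers in preallocated slots, so enqueueing and dequeueing only move pointers and never allocate a node. The `Ring` class and its members below are hypothetical names, and the sketch is single-threaded on purpose: it shows the allocation behaviour only and leaves out the locking a real queue would need.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical fixed-capacity queue of shared_ptr<T>. All slots are
// allocated once with the Ring itself; push/pop do no heap allocation.
// Not thread safe: a real queue would need synchronization on top.
template <typename T, std::size_t N>
class Ring {
  std::array<std::shared_ptr<T>, N> slots_;
  std::size_t head_ = 0;  // index of the oldest element
  std::size_t count_ = 0; // number of occupied slots

public:
  // Returns false when the buffer is full; the caller decides what to do.
  bool push(std::shared_ptr<T> p) {
    if (count_ == N) {
      return false;
    }
    slots_[(head_ + count_) % N] = std::move(p);
    ++count_;
    return true;
  }

  // Returns an empty pointer when the buffer is empty.
  std::shared_ptr<T> pop() {
    if (count_ == 0) {
      return nullptr;
    }
    std::shared_ptr<T> p = std::move(slots_[head_]); // slot is left empty
    head_ = (head_ + 1) % N;
    --count_;
    return p;
  }

  std::size_t size() const { return count_; }
};

// Exercises the full, wrap-around, and empty cases.
bool demo_ring() {
  Ring<int, 2> q;
  auto a = std::make_shared<int>(1);
  bool pushed = q.push(a) && q.push(std::make_shared<int>(2));
  bool rejected = !q.push(std::make_shared<int>(3)); // full
  bool pinned = (a.use_count() == 2);                // queue keeps it alive
  auto first = q.pop();
  bool wrapped = q.push(std::make_shared<int>(3));   // reuses slot 0
  auto second = q.pop();
  auto third = q.pop();
  auto none = q.pop();
  return pushed && rejected && pinned && wrapped &&
         *first == 1 && *second == 2 && *third == 3 && !none;
}
```

Because each slot owns a reference, an object sitting in the queue cannot be deallocated underneath it, which is the exposure to deallocation that raw pointers have. The price is a hard capacity limit that the caller has to handle.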

Experimentally, the fix I did for TS-934 had a strong effect on reducing the 
problem, although it was not a complete solution. My attempt to expand it in 
TS-1031 did not end well, due to the lack of locks from NetHandler. At that 
point I was looking at a significant change, so I decided that I should try to 
do the best thing, rather than a minimalist hack which would not really be that 
much smaller a change.

Friday, December 9, 2011, 11:38:47 AM, you wrote:

> Could you be more specific about the problem (e.g. bug numbers)?
> VConnection locking works fine in all cases that I know of.  There are
> some issues with closing and deallocation with threading and the higher
> layers, but these could be solved in a number of ways.  Smart pointers is
> one (and probably not a bad one) although it isn't a panacea, just a
> mechanism.  Similarly, a circular buffer queue is just a different way of
> viewing the data structure.  Sure, it has some advantages, particularly in
> exposure to deallocation and locking, but it isn't a silver bullet either.
> I'd like to understand the core problem.
