Thanks John and AMC, especially AMC, who contributed a lot to solving
this problem. Thanks again.
I really should have looked more closely at their posts as well. It
took a long time for me to remember how lock switching was supposed to
work. I think it is far too brittle and clearly it was poorly documented.
I am going to add some inline documentation and when I can break some time
awa
Looks good. I apologize to weijin and ming_zym for being too distracted to look
closely at those even though I meant to do so. Not being able to replicate the
problem even on old code bases made it problematic as well.
Sunday, March 25, 2012, 5:56:01 PM, you wrote:
> I found the problem and it
I found the problem and it has nothing to do with any of this. The
problem, as quite rightly pointed out by weijin, is that when the closed
flag is set by one thread, another thread can delete the NetVC. This is
expected and desirable; unfortunately, do_io_close() is accessing the "nh"
variable (
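The ordering hazard described in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions: FakeNetVC, FakeNetHandler and close_vc() are hypothetical stand-ins, not the real Traffic Server types or the actual fix. The idea shown is to copy anything still needed out of the object before publishing the closed flag, because once the flag is visible another thread may delete the object.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the handler that "nh" points to.
struct FakeNetHandler {
  int wakeups = 0;
  void wake() { ++wakeups; }
};

// Hypothetical stand-in for a NetVC; only the two fields that matter here.
struct FakeNetVC {
  FakeNetHandler *nh = nullptr;
  std::atomic<bool> closed{false};
};

// Unsafe ordering (shown only as a comment): once `closed` is published,
// another thread may delete `vc`, so reading vc->nh afterwards is a
// potential use-after-free.
//   vc->closed.store(true);
//   vc->nh->wake();

// Safer ordering: read what is needed first, then publish the flag.
inline FakeNetHandler *close_vc(FakeNetVC *vc) {
  FakeNetHandler *nh = vc->nh; // vc is still guaranteed to be alive here
  vc->closed.store(true, std::memory_order_release);
  // From this point on, vc must be treated as possibly deleted.
  return nh;
}
```

A caller would use only the returned handler pointer after close_vc() returns and would never touch the FakeNetVC again.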
Saturday, March 17, 2012, 10:33:18 PM, you wrote:
> On Thu, Mar 15, 2012 at 11:01 AM, Alan M. Carroll
> wrote:
> Thursday, March 15, 2012, 9:26:33 AM, you wrote:
>> [The lock] is only de-allocated after the close() by which time all
>> references to that NetVC should have been dropped by the clie
I tried this patch and ran some load, w/o an assert. It would seem to
assert that all HttpSM transitions occur on the same thread which would
seem to imply that only one thread ever invoked a given HttpSM. Ideas?
john
diff --git a/proxy/http/HttpSM.cc b/proxy/http/HttpSM.cc
index 2f069ef..ec7f
RE: TS-857 commented on that with a patch.
RE: TS-1114, I commented on that. That is a serious bug. We need to get
that committed.
john
On Sat, Mar 17, 2012 at 8:48 PM, ming@gmail.com wrote:
> I think that is an event which is cancelled and destructed, but still
> running in another thread, t
I think that is an event which is cancelled and destructed, but still
running in another thread. The event id in our stack shows the same issue
as TS-1114, which turned out to be a locking issue, where we should protect
all the vol open/write.
The event is freed and the same memory is filled with other data
This makes no sense at all then. If there is no sharing there should be
only one thread in play, so there can't be a thread A and a thread B. I'd
love to know where that other thread comes from. It seems like a deeper
problem. I added asserts a while back to ensure that a transaction stayed
ent
On Thu, Mar 15, 2012 at 11:01 AM, Alan M. Carroll <
a...@network-geographics.com> wrote:
> Thursday, March 15, 2012, 9:26:33 AM, you wrote:
>
> > [The lock] is only de-allocated after the close() by which time all
> > references to that NetVC should have been dropped by the client.
>
> Yes, I unde
Thursday, March 15, 2012, 12:39:08 PM, you wrote:
> On 3/15/12 10:58 AM, Alan M. Carroll wrote:
>> Thursday, March 15, 2012, 10:43:34 AM, you wrote:
This can only occur when there is some connection sharing or if someone has
introduced a thread switch in some other processor which trigg
Thursday, March 15, 2012, 9:26:33 AM, you wrote:
> [The lock] is only de-allocated after the close() by which time all
> references to that NetVC should have been dropped by the client.
Yes, I understand. What I do not understand is what actual, specific,
implementation mechanism I can use to ma
On 3/15/12 10:58 AM, Alan M. Carroll wrote:
Thursday, March 15, 2012, 10:43:34 AM, you wrote:
This can only occur when there is some connection sharing or if someone has
introduced a thread switch in some other processor which triggers the OS
connection. AFAIK the OS connection is initiated on
Thursday, March 15, 2012, 10:43:34 AM, you wrote:
>> This can only occur when there is some connection sharing or if someone has
>> introduced a thread switch in some other processor which triggers the OS
>> connection. AFAIK the OS connection is initiated on the thread which has
>> the client co
> Date: Wed, 14 Mar 2012 19:29:43 -0700
> From: jplev...@acm.org
>
> On Wed, Mar 14, 2012 at 8:22 AM, Alan M. Carroll <
> a...@network-geographics.com> wrote:
>
> >
> > > There is, however, one situation where this simple and safe order of
> > events
> > > is not followed. That is connection sh
On 3/14/12 8:29 PM, John Plevyak wrote:
On Wed, Mar 14, 2012 at 8:22 AM, Alan M. Carroll<
a...@network-geographics.com> wrote:
There is, however, one situation where this simple and safe order of
events
is not followed. That is connection sharing to origin servers. Here the
situations star
On Thu, Mar 15, 2012 at 7:05 AM, Alan M. Carroll <
a...@network-geographics.com> wrote:
> Wednesday, March 14, 2012, 9:29:43 PM, John Plevyak wrote:
>
> > My view is that this is only one of many failure modes, albeit the most
> > common one.
>
> I disagree because only in the close case is the lo
Wednesday, March 14, 2012, 9:29:43 PM, John Plevyak wrote:
> My view is that this is only one of many failure modes, albeit the most
> common one.
I disagree because only in the close case is the lock itself de-allocated. In
all other cases the locks continue to be valid. So while all the other
This looks to be superior to that patch, so I would be for replacing that.
I still see this as a quite complicated bandaid, but if it makes things
better in the short term, I am not opposed.
john
On Wed, Mar 14, 2012 at 11:44 AM, Alan M. Carroll <
a...@network-geographics.com> wrote:
> One thin
On Wed, Mar 14, 2012 at 8:22 AM, Alan M. Carroll <
a...@network-geographics.com> wrote:
>
> > There is, however, one situation where this simple and safe order of
> events
> > is not followed. That is connection sharing to origin servers. Here the
> > situations starts the same, but when the cli
As I remember, the commit had already been reverted, and I said on IRC
that the rescheduling patch has problems.
On Wed, 2012-03-14 at 13:44 -0500, Alan M. Carroll wrote:
> One thing that should be noted is that, regardless of whether this patch is a
> permanent fix, I think it is clearly superior to the curr
Wednesday, March 14, 2012, 10:57:17 AM, you wrote:
> According to your example timeline, can we tell whether there is a path
> where HttpSM dereferences the vc after it calls vc->do_io_close?
For thread B, the mutex->acquire() dereferences the VC pointer to get the mutex
object. If you want a fuller chai
On Wed, 2012-03-14 at 10:22 -0500, Alan M. Carroll wrote:
> Tuesday, March 13, 2012, 11:46:15 PM, you wrote:
>
> > Here are my comments for what they are worth.
>
> > First, let me detail the issue this is trying to address.
>
> > The way that most clients work with VCs is via a Processor::ope
Tuesday, March 13, 2012, 11:46:15 PM, you wrote:
> Here are my comments for what they are worth.
> First, let me detail the issue this is trying to address.
> The way that most clients work with VCs is via a Processor::open_
> which calls back with an OPEN event at which point they set VI
I have not read this patch carefully, so I have no idea whether it
solves the problem or not. I just have two questions:
1) Have we found the cause of the crash? (how to trigger the crash, in what
situation)
2) Is it worth using smart pointers to prevent the crash? (maintain the
design principle of
Alan:
I think we have already put a lot of effort into this issue, and we are
fairly sure that we are getting close to its root cause. From my team's
side, we'd like to keep tracking it instead of hiding the issue. Please
hold back the commit, and give us more detail on the crash back
traces as much as p
Here are my comments for what they are worth.
First, let me detail the issue this is trying to address.
The way that most clients work with VCs is via a Processor::open_
which calls back with an OPEN event at which point they set VIOs mutex
field and from this point on, access to the VC is
I have a patch submitted for TS-857 which attempts to make a start on finer
grained locking. The essence of it is
1) Provide more powerful and standards compliant reference counted smart
pointers - lib/ts/IntrusivePtr.h
2) Provide reference counted lists that are compatible with both (1) and
e
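As a rough illustration of why reference counting comes up in this thread, the sketch below keeps a lock alive after the object that owned it has been destroyed. It uses std::shared_ptr from the standard library as a swapped-in technique, not the intrusive pointer from lib/ts/IntrusivePtr.h in the patch. FakeVC and hold_lock() are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a VC that owns a reference-counted lock.
struct FakeVC {
  std::shared_ptr<std::mutex> mutex = std::make_shared<std::mutex>();
};

// A client takes its own reference to the lock while the VC is known to be
// alive. The lock then outlives the VC for as long as the client holds it.
inline std::shared_ptr<std::mutex> hold_lock(const FakeVC &vc) {
  return vc.mutex;
}
```

With this arrangement the client dereferences its own shared_ptr rather than the VC pointer, so acquiring the lock after the VC has been freed does not touch freed memory. Whether that cost is acceptable on a hot path is a separate design question.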
28 matches