On Mon, Dec 15, 2014 at 9:23 AM, John Baldwin wrote:
> On Wednesday, December 10, 2014 12:47:02 PM Jason Wolfe wrote:
>> John,
>>
>> So apparently the concurrent timer scheduling issue was not fixed, though it
>> does seem rarer. We had about 2 weeks of stability, then last night
>> we had a crash on a
On Thursday, October 23, 2014 02:12:44 PM Jason Wolfe wrote:
> On Sat, Oct 18, 2014 at 4:42 AM, John Baldwin wrote:
> > On Friday, October 17, 2014 11:32:13 PM Jason Wolfe wrote:
> >> Producing 10G of random traffic against a server with this assertion
>> added took about 2 hours to panic, so if it turns out we need anything
>> further it should be pretty quick.
On Sat, Oct 18, 2014 at 4:42 AM, John Baldwin wrote:
> On Friday, October 17, 2014 11:32:13 PM Jason Wolfe wrote:
>> Producing 10G of random traffic against a server with this assertion
>> added took about 2 hours to panic, so if it turns out we need anything
>> further it should be pretty quick.
On Friday, October 17, 2014 11:43:26 PM Adrian Chadd wrote:
> Hm, is this the bug that was just fixed in -HEAD?
>
> I saw this similar bug on -HEAD with lots of quick connections and
> reused ports. It ended up dereferencing a NULL tcp timer pointer from
> the inpcb. Is that what the code in your tree is doing?
On Friday, October 17, 2014 11:32:13 PM Jason Wolfe wrote:
> Producing 10G of random traffic against a server with this assertion
> added took about 2 hours to panic, so if it turns out we need anything
> further it should be pretty quick.
>
> #4 list
> 2816 * timer and remember
Hm, is this the bug that was just fixed in -HEAD?
I saw this similar bug on -HEAD with lots of quick connections and
reused ports. It ended up dereferencing a NULL tcp timer pointer from
the inpcb. Is that what the code in your tree is doing?
-a
On 17 October 2014 23:32, Jason Wolfe wrote:
> On
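Adrian's scenario, heavy connection churn leaving an inpcb whose TCP timer
block is already gone, is exactly the sort of bug an assertion catches
quickly. The thread never shows Jason's exact assertion; a minimal sketch of
such a guard, assuming the stock FreeBSD 10 names (tp->t_timers, tt_rexmt,
tcp_timer_rexmt) and an invented function name, might look like:

    /*
     * Hypothetical guard, not the committed fix: panic loudly if a tcpcb
     * whose timer state was already torn down is asked to arm a timer.
     */
    static void
    example_arm_rexmt(struct tcpcb *tp, int timo)
    {
            INP_WLOCK_ASSERT(tp->t_inpcb);

            /* A torn-down/reused connection may have lost its timers. */
            KASSERT(tp->t_timers != NULL,
                ("%s: tcpcb %p has NULL timer state", __func__, tp));

            callout_reset(&tp->t_timers->tt_rexmt, timo,
                tcp_timer_rexmt, tp);
    }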
On Thu, Oct 16, 2014 at 12:23 PM, John Baldwin wrote:
>
>
> I looked at the other trace and I don't think it disagrees with my previous
> theory. I do have more KTR patches to log when we spin on locks, which would
> really confirm this, but I haven't tested those fully on HEAD yet.
>
> However, I
On Saturday, October 11, 2014 2:19:19 am Jason Wolfe wrote:
> On Fri, Oct 10, 2014 at 8:53 AM, John Baldwin wrote:
>
> > On Thursday, October 09, 2014 02:31:32 PM Jason Wolfe wrote:
> > > On Wed, Oct 8, 2014 at 12:29 PM, John Baldwin wrote:
> > > > My only other thought is if a direct timeout routine ran for a long time.
On Fri, Oct 10, 2014 at 11:19 PM, Jason Wolfe wrote:
> On Fri, Oct 10, 2014 at 8:53 AM, John Baldwin wrote:
>
>> On Thursday, October 09, 2014 02:31:32 PM Jason Wolfe wrote:
>> > On Wed, Oct 8, 2014 at 12:29 PM, John Baldwin wrote:
>> > > My only other thought is if a direct timeout routine ran for a long time.
On Fri, Oct 10, 2014 at 8:53 AM, John Baldwin wrote:
> On Thursday, October 09, 2014 02:31:32 PM Jason Wolfe wrote:
> > On Wed, Oct 8, 2014 at 12:29 PM, John Baldwin wrote:
> > > My only other thought is if a direct timeout routine ran for a long time.
> > >
> > > I just committed a change to current that can let you capture KTR traces
> > > of callout routines for use with schedgraph (r272757).
On Thursday, October 09, 2014 02:31:32 PM Jason Wolfe wrote:
> On Wed, Oct 8, 2014 at 12:29 PM, John Baldwin wrote:
> > My only other thought is if a direct timeout routine ran for a long time.
> >
> > I just committed a change to current that can let you capture KTR traces
> > of callout routines for use with schedgraph (r272757).
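For context on "direct" timeouts: since the 10.x callout rework, a handler
can be flagged to run straight from the timer interrupt rather than from a
softclock thread, so a slow handler stalls all further timer processing on
that CPU, which is the failure mode John is hypothesizing here. A minimal
sketch using the real callout_reset_sbt(9) interface and C_DIRECT_EXEC flag
(the handler and names are invented):

    static struct callout example_co;

    /* With C_DIRECT_EXEC this runs in interrupt context. */
    static void
    example_handler(void *arg)
    {
            /* Anything slow here delays all other timers on this CPU. */
    }

    static void
    example_schedule(void)
    {
            callout_init(&example_co, 1);   /* 1 = MPSAFE */
            /* Fire in 1 ms, executed directly from the timer interrupt. */
            callout_reset_sbt(&example_co, SBT_1MS, 0, example_handler,
                NULL, C_DIRECT_EXEC);
    }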
On Wed, Oct 8, 2014 at 12:29 PM, John Baldwin wrote:
> My only other thought is if a direct timeout routine ran for a long time.
>
> I just committed a change to current that can let you capture KTR traces of
> callout routines for use with schedgraph (r272757). Unfortunately,
> enabling KTR_SCHED
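For anyone wanting to try John's suggestion, the usual recipe is to compile
KTR into the kernel, reproduce the stall, and feed the trace to schedgraph.
A sketch of the setup (exact option spellings should be checked against
sys/conf/NOTES on your branch):

    options         KTR
    options         KTR_ENTRIES=1048576
    options         KTR_COMPILE=(KTR_SCHED)
    options         KTR_MASK=(KTR_SCHED)

After a crash the buffer can be pulled out of the core and graphed:

    # ktrdump -e /boot/kernel/kernel -m /var/crash/vmcore.0 -t -o ktr.out
    # python /usr/src/tools/sched/schedgraph.py ktr.out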
On Wednesday, October 08, 2014 10:56:56 AM Jason Wolfe wrote:
> On Tue, Oct 7, 2014 at 11:28 AM, John Baldwin wrote:
> > On Tuesday, October 07, 2014 2:06:42 pm Jason Wolfe wrote:
> > > Hey John,
> > >
> > > Happy to do this, but the pool of boxes is about 500 large, which is the
> > > reason I'm able to see a crash every day or so.
On Tue, Oct 7, 2014 at 11:28 AM, John Baldwin wrote:
> On Tuesday, October 07, 2014 2:06:42 pm Jason Wolfe wrote:
> > Hey John,
> >
> > Happy to do this, but the pool of boxes is about 500 large, which is the
> > reason I'm able to see a crash every day or so. I've pulled a portion of
> > them o
On Thursday, October 02, 2014 06:40:21 PM Jason Wolfe wrote:
> On Wed, Sep 10, 2014 at 8:24 AM, John Baldwin wrote:
> > On Monday, September 08, 2014 03:34:02 PM Eric van Gyzen wrote:
> > > On 09/08/2014 15:19, Sean Bruno wrote:
> > > > On Mon, 2014-09-08 at 12:09 -0700, Sean Bruno wrote:
> > > >>
On Wed, Sep 10, 2014 at 8:24 AM, John Baldwin wrote:
> On Monday, September 08, 2014 03:34:02 PM Eric van Gyzen wrote:
> > On 09/08/2014 15:19, Sean Bruno wrote:
> > > On Mon, 2014-09-08 at 12:09 -0700, Sean Bruno wrote:
> > >> This sort of looks like the hardware failed to respond to us in time?
On Monday, September 08, 2014 03:34:02 PM Eric van Gyzen wrote:
> On 09/08/2014 15:19, Sean Bruno wrote:
> > On Mon, 2014-09-08 at 12:09 -0700, Sean Bruno wrote:
> >> This sort of looks like the hardware failed to respond to us in time?
> >> Too busy?
> >>
> >> sean
> >
> > This seems to be affecting my 10/stable machines from 15Aug2014.
On Mon, 2014-09-08 at 15:34 -0400, Eric van Gyzen wrote:
> >> Unread portion of the kernel message buffer:
> >> spin lock 0x812a0400 (callout) held by 0xf800151fe000 (tid 13) too long
>
> TID 13 is usually a kernel idle thread, which would seem to indicate a dangling
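The "dangling" reference (the preview cuts Eric off) points at a classic
hazard: a structure containing a callout is freed while the callout is still
pending or running, leaving the callout wheel pointing into freed memory,
which an idle thread later trips over. A minimal sketch of the hazard and
the usual cure (struct conn and conn_destroy are invented for illustration;
callout_drain(9) is the real interface):

    struct conn {
            struct callout  c_timer;
            /* ... per-connection state ... */
    };

    static void
    conn_destroy(struct conn *cp)
    {
            /*
             * callout_stop() alone is not enough if the handler may
             * already be running on another CPU; callout_drain() also
             * waits for a running handler to finish before the free.
             */
            callout_drain(&cp->c_timer);
            free(cp, M_TEMP);
    }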
On 09/08/2014 15:19, Sean Bruno wrote:
> On Mon, 2014-09-08 at 12:09 -0700, Sean Bruno wrote:
>> This sort of looks like the hardware failed to respond to us in time?
>> Too busy?
>>
>> sean
>>
> This seems to be affecting my 10/stable machines from 15Aug2014.
>
> Not a lot of churn in the code so I don't think this is new.
On Mon, 2014-09-08 at 12:09 -0700, Sean Bruno wrote:
> This sort of looks like the hardware failed to respond to us in time?
> Too busy?
>
> sean
>
This seems to be affecting my 10/stable machines from 15Aug2014.
Not a lot of churn in the code so I don't think this is new. The
afflicted machines
This sort of looks like the hardware failed to respond to us in time?
Too busy?
sean
panic: spin lock held too long
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
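For context on the panic string itself: spin mutexes are acquired by
busy-waiting, and the waiter panics after a bounded number of spins rather
than hanging silently. A simplified sketch of the idea, modeled loosely on
sys/kern/kern_mtx.c (the real path also handles recursion, interrupt
disabling, and diagnostics):

    static void
    spin_lock_sketch(volatile uintptr_t *lockp, uintptr_t self)
    {
            int spins = 0;

            while (!atomic_cmpset_acq_ptr(lockp, 0, self)) {
                    /* Someone owns it; wait, but not forever. */
                    while (*lockp != 0) {
                            if (++spins > 10000000)
                                    panic("spin lock held too long");
                            cpu_spinwait();
                    }
            }
    }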