Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-14 Thread Guy Helmer
Linda Messerschmidt wrote: Well, this is interesting. I got really frustrated with the other approach, so I thought I'd thin a machine down absolutely as far as I could, eliminate every possible source of delay, and see what happens. I killed everything... cron, RPC, NFS, devd, gmon, nrpe, ever

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-12 Thread Linda Messerschmidt
OK, first, I figured out the seven second thing. I actually had already found that particular issue earlier in the troubleshooting process, but forgot all about it when I pulled in a second machine to test with. It was simply a case of setting Apache's MaxRequestsPerChild to a very low value (128

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-11 Thread Linda Messerschmidt
On Sat, Sep 12, 2009 at 2:52 AM, Linda Messerschmidt wrote: > On Sat, Sep 12, 2009 at 1:47 AM, Julian Elischer wrote: >> ok now we need to describe the hang..  if you can predictably get a hang >> every 7 seconds does this mean that it doesn't respond to keyboard for a >> moment every 7 seconds?

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-11 Thread Linda Messerschmidt
On Sat, Sep 12, 2009 at 1:47 AM, Julian Elischer wrote: > ok now we need to describe the hang..  if you can predictably get a hang > every 7 seconds does this mean that it doesn't respond to keyboard for a > moment every 7 seconds? It's possible. > or that it doesn't accept packets every 7 secon

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-11 Thread Julian Elischer
Linda Messerschmidt wrote: On Sat, Sep 12, 2009 at 12:06 AM, Julian Elischer wrote: does the system have a serial console? how about a normal console/keyboard? It has an IP KVM. how often does it hang? and for how long? Well, this is interesting. I got really frustrated with the other

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-11 Thread Linda Messerschmidt
On Sat, Sep 12, 2009 at 12:06 AM, Julian Elischer wrote: > does the system have a serial console? how about a normal console/keyboard? It has an IP KVM. > how often does it hang? and for how long? Well, this is interesting. I got really frustrated with the other approach, so I thought I'd th

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-11 Thread Julian Elischer
Linda Messerschmidt wrote: OK, I have learned that ktrdump looks up the name of the process associated with a particular KSE at the time of the dump, so if it's changed since tracing stopped, it will blissfully blame the wrong process. I understand why that's the case, but it still sucks for

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-11 Thread Linda Messerschmidt
OK, I have learned that ktrdump looks up the name of the process associated with a particular KSE at the time of the dump, so if it's changed since tracing stopped, it will blissfully blame the wrong process. I understand why that's the case, but it still sucks for troubleshooting. :( This ti

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-11 Thread Linda Messerschmidt
On Fri, Sep 11, 2009 at 3:06 PM, John Baldwin wrote: > Something like this: Ah, I understand now. :) Got up to 17 seconds of trace with that change. > Hmm.  It works well for me for doing traces. It definitely works, it just always seems to have some-or-another weird artifact. But, with the l

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-11 Thread John Baldwin
On Friday 11 September 2009 1:35:00 pm Linda Messerschmidt wrote: > On Fri, Sep 11, 2009 at 11:02 AM, John Baldwin wrote: > > Try turning off KTR_LOCK for spin mutexes (just force LO_QUIET on in > > mtx_init() if MTX_SPIN is set) > > I have *no* idea what you just said. :) > > Which is fine. Bu

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-11 Thread John Baldwin
On Friday 11 September 2009 11:35:14 am Julian Elischer wrote: > John Baldwin wrote: > > A more recent schedgraph.py might also > > fix the bugs you were seeing with the idle threads looking too long (esp. at > > the start and end of graphs). > > not unless something has been fixed

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-11 Thread Linda Messerschmidt
On Fri, Sep 11, 2009 at 11:02 AM, John Baldwin wrote: > Try turning off KTR_LOCK for spin mutexes (just force LO_QUIET on in > mtx_init() if MTX_SPIN is set) I have *no* idea what you just said. :) Which is fine. But more to the point, I have no idea how to do it. :) > A more recent schedgra

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-11 Thread Julian Elischer
John Baldwin wrote: A more recent schedgraph.py might also fix the bugs you were seeing with the idle threads looking too long (esp. at the start and end of graphs). not unless something has been fixed in the last week or so.

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-11 Thread John Baldwin
On Thursday 10 September 2009 9:34:30 pm Linda Messerschmidt wrote: > Just to follow up, I've been doing some testing with masking for > KTR_LOCK rather than KTR_SCHED. > > I'm having trouble with this because I have the KTR buffer size set to > 1048576 entries, and with only KTR_LOCK enabled, thi

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-10 Thread Linda Messerschmidt
Just to follow up, I've been doing some testing with masking for KTR_LOCK rather than KTR_SCHED. I'm having trouble with this because I have the KTR buffer size set to 1048576 entries, and with only KTR_LOCK enabled, this isn't enough for even a full second of tracing; the sample I'm working with
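
The buffer-size problem described above is simple arithmetic: a circular trace buffer holds roughly (entries / events per second) seconds of history. A rough sketch follows. The 1048576 figure comes from the post; the event rate is a made-up number purely for illustration, and rounding the entry count up to a power of two is an assumption about what the kernel option expects (check ktr(4) before relying on it).

```python
import math


def trace_window_seconds(buffer_entries, events_per_second):
    """Seconds of history a circular trace buffer holds at a steady event rate."""
    if events_per_second <= 0:
        raise ValueError("events_per_second must be positive")
    return buffer_entries / events_per_second


def entries_for_window(seconds, events_per_second):
    """Smallest power-of-two entry count that covers `seconds` at the given rate.

    The power-of-two rounding is an assumption, not something stated in the thread.
    """
    needed = max(1, math.ceil(seconds * events_per_second))
    return 1 << (needed - 1).bit_length()


KTR_BUFFER_ENTRIES = 1048576           # buffer size quoted in the post
ASSUMED_EVENTS_PER_SECOND = 2_000_000  # hypothetical rate, NOT measured

print(trace_window_seconds(KTR_BUFFER_ENTRIES, ASSUMED_EVENTS_PER_SECOND))
print(entries_for_window(1.0, ASSUMED_EVENTS_PER_SECOND))
```

With any rate above about a million events per second, the window drops below one second, which matches the symptom reported here. The practical options are a larger buffer or a narrower trace mask.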

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-10 Thread Linda Messerschmidt
On Thu, Sep 10, 2009 at 2:46 PM, Julian Elischer wrote: > I've noticed that schedgraph tends to show the idle threads slightly > skewed one way or the other.  I think there is a cumulative rounding > error in the way they are drawn due to the fact that they are run so > often.  Check the raw data a

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-10 Thread Julian Elischer
Linda Messerschmidt wrote: On Thu, Sep 10, 2009 at 12:57 PM, Ryan Stone wrote: You should be able to run schedgraph.py on a windows machine with python installed. It works just fine for me on XP. Don't have any of those either, but I *did* get it working on a Mac right out of the box. Should

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-10 Thread Linda Messerschmidt
On Thu, Sep 10, 2009 at 12:57 PM, Ryan Stone wrote: > You should be able to run schedgraph.py on a windows machine with python > installed.  It works just fine for me on XP. Don't have any of those either, but I *did* get it working on a Mac right out of the box. Should have thought of that soone

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-10 Thread Ryan Stone
You should be able to run schedgraph.py on a windows machine with python installed. It works just fine for me on XP. ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-09-10 Thread Linda Messerschmidt
On Thu, Aug 27, 2009 at 5:29 PM, John Baldwin wrote: > Ah, cool, what you want to do is use KTR with KTR_SCHED and then use > schedgraph.py (src/tools/sched) to get a visual picture of what the box does > during a hang.  The timestamps in KTR are TSC cycle counts rather than an > actual wall time w
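
Since the KTR timestamps are TSC cycle counts rather than wall time, the length of a hang has to be derived by dividing a cycle delta by the TSC frequency. A minimal sketch, assuming a constant TSC rate: the 2 GHz frequency and the two timestamps below are hypothetical values, not taken from any trace in this thread. On many x86 FreeBSD systems the real frequency can reportedly be read from `sysctl machdep.tsc_freq`, but treat that as something to verify on the machine in question.

```python
def tsc_cycles_to_seconds(start_cycles, end_cycles, tsc_hz):
    """Convert a pair of TSC cycle counts into elapsed seconds.

    Assumes the TSC ticks at a constant rate of `tsc_hz` cycles per second
    and that both timestamps come from counters that agree with each other.
    """
    if tsc_hz <= 0:
        raise ValueError("tsc_hz must be positive")
    return (end_cycles - start_cycles) / tsc_hz


ASSUMED_TSC_HZ = 2_000_000_000  # hypothetical 2 GHz TSC, for illustration only
hang_start = 10_000_000_000     # hypothetical timestamp of the last event before a gap
hang_end = 13_000_000_000       # hypothetical timestamp of the first event after it

print(tsc_cycles_to_seconds(hang_start, hang_end, ASSUMED_TSC_HZ))
```

If the CPU changes clock speed during the trace, or the per-CPU counters are not synchronized, the result is only an approximation.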

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-08-27 Thread Kostik Belousov
On Thu, Aug 27, 2009 at 04:14:39PM -0400, Linda Messerschmidt wrote: > On Wed, Aug 26, 2009 at 4:42 PM, John Baldwin wrote: > > One thing to note is that ktrace only logs voluntary context switches (i.e. > > call to tsleep or waiting on a condition variable). It specifically does not > > log

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-08-27 Thread Linda Messerschmidt
On Wed, Aug 26, 2009 at 4:42 PM, John Baldwin wrote: > One thing to note is that ktrace only logs voluntary context switches (i.e. > call to tsleep or waiting on a condition variable). It specifically does not > log preemptions or blocking on a mutex, I was not aware, thanks. > so in theory if y

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-08-26 Thread Nate Eldredge
On Wed, 26 Aug 2009, Linda Messerschmidt wrote: I'm trying to troubleshoot an intermittent Apache performance problem, and I've narrowed it down to what appears to be a brief whole-system hang that lasts from 0.5 to 3 seconds. They occur every few minutes. One thought would be to use "ps"

Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-08-26 Thread John Baldwin
On Wednesday 26 August 2009 3:03:13 pm Linda Messerschmidt wrote: > I'm trying to troubleshoot an intermittent Apache performance problem, > and I've narrowed it down to what appears to be a brief > whole-system hang that lasts from 0.5 to 3 seconds. They occur every > few minutes. One thing