Re: Intermittent system hangs on 7.2-RELEASE-p1

Linda Messerschmidt Fri, 11 Sep 2009 20:55:59 -0700

OK, I have learned that ktrdump looks up the name of the process
associated with a particular KSE at the time of the dump, so if
it's changed since tracing stopped, it will blissfully blame the wrong
process.  I understand why that's the case, but it still sucks for
troubleshooting. :(
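
To make the misattribution concrete, here is a minimal sketch of the effect. It is not ktrdump's actual code: the record layout, the thread IDs, and the process names are all hypothetical, and the only thing it models is "the trace stores a thread reference, the name is resolved later."

```python
# Hypothetical illustration only; this is NOT how ktrdump is implemented.
# Trace records store a thread reference (here an integer ID), not a name.
trace = [
    {"tsc": 1000, "thread": 7, "event": "lock acquire"},
    {"tsc": 2000, "thread": 7, "event": "running"},
]

# Who owned each thread slot while tracing was active (made-up names).
names_at_trace_time = {7: "php"}

# Who owns the same slot when the dump is taken: the slot has been reused.
names_at_dump_time = {7: "httpd"}


def blame(records, names):
    """Attach a process name to each record using the given lookup table."""
    return [(r["tsc"], names.get(r["thread"], "?"), r["event"]) for r in records]


late = blame(trace, names_at_dump_time)    # what a dump-time lookup reports
early = blame(trace, names_at_trace_time)  # what was actually running

# Every record whose reported owner differs from the real one.
mismatched = [a for a, b in zip(late, early) if a[1] != b[1]]
```

Under these assumptions every record in `mismatched` is blamed on the later owner of the slot, which is why dumping as soon as possible after tracing stops should reduce (but not eliminate) the problem.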


This time, "pf task mtx" and "vnode_free_list" are the locks getting
the blame.  The processes fingered are an httpd (the root "parent"
of the one doing the work, which does nothing but select() for 1s and
wait to see if its children died), and vnlru.  No correlation at all
to the previous results, and this machine is now utterly quiescent
except for the httpd process and the PHP exerciser.  Hard to imagine
vnlru has 1s worth of running to do on a machine with 949 total vnodes
in use.

A third run produced a 997ms "lock acquire" for "buffer daemon lock,"
a 497ms one for ip6qlock (no, there's no IPv6 in use on this machine),
and an 8s (!!!) one on unp_mtx.  bufdaemon had a 997ms "running" bar,
but according to the raw TSC values, that happened on the same CPU
1.999s *after* the 997ms buffer daemon lock acquire.
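
For anyone checking the ordering by hand, the comparison above amounts to subtracting two raw TSC readings taken on the same CPU and dividing by the TSC frequency. The sketch below shows that arithmetic. The 2.0 GHz frequency and both tick values are made-up assumptions chosen only to reproduce a 1.999s gap; the real frequency would have to come from the machine itself (on FreeBSD it is normally reported by the machdep.tsc_freq sysctl).

```python
# Assumed TSC frequency; NOT taken from the machine in this thread.
TSC_HZ = 2_000_000_000  # hypothetical 2.0 GHz


def tsc_delta_seconds(earlier, later, hz=TSC_HZ):
    """Elapsed seconds between two raw TSC readings from the same CPU."""
    return (later - earlier) / hz


# Made-up readings for illustration only.
lock_acquire_tsc = 10_000_000_000  # start of the "lock acquire" event
running_tsc = 13_998_000_000       # start of the "running" bar

gap = tsc_delta_seconds(lock_acquire_tsc, running_tsc)
running_came_later = gap > 0
```

The subtraction is only trustworthy for readings from one CPU, since TSCs on different cores are not guaranteed to be synchronized.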

I really don't know where to go from here.  There's so little
consistency that I'm just not sure if the data is bad, the tool is
bad, the operator is bad, or there's some problem so fundamentally
horrible that all I'm seeing is random side effects.
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
