On 25/07/2011 11:59, Kostik Belousov wrote:
> On Mon, Jul 25, 2011 at 12:21:07PM +0200, Herve Boulouis wrote:
> > Hi list,
> >
> > We have 2 freebsd 8.2-STABLE boxes (cvsuped june 22) that keep crashing
> > in a bad way:
> >
> > They are doing heavy apache / php4 web serving from a nfs mount and panic
> > at least once a day with the following message (no crash dump produced,
> > hand copied from the console):
> >
> > Sleeping on "vmopar" with the following non-sleepable locks held:
> > exclusive sleep mutex NFSnode lock (NFSnode lock) r = 0 (0xffffff0201798000) locked @ nfsclient/nfs_subs.c:538
> > lock order reversal:
> >  1st 0xffffffff018ff6da80 turnstile lock (turnstile lock) @ kern/subr_turnstile.c:190
> >  2nd 0xffffffffff80b52b10 scrlock (scrlock) @ dev/syscons.c:2570
> > lock order reversal:
> >  1st 0xffffffff018ff6da80 turnstile lock (turnstile lock) @ kern/subr_turnstile.c:190
> >  2nd 0xffffffffff80b78ef8 sleepq chain (sleepq chain) @ kern/subr_turnstile.c:203
> > lock order reversal:
> >  1st 0xffffffffff80b78ef8 sleepq chain (sleepq chain) @ kern/subr_turnstile.c:203
> >  2nd 0xffffffffff80b52b10 scrlock (scrlock) @ dev/syscons.c:2570
> > Sleeping thread (tid 100998, pid 20700) owns a non-sleepable lock
> > panic: sleeping thread
> > cpuid = 1
> > panic: bufwrite: buffer is not busy???
> > cpuid = 1
> >
> > The 2 servers share the same load and panic consistently. I enabled
> > WITNESS on the 2 in the hope it would allow the boxes to auto reboot
> > after panic and get extra debug info. I got debug info but the servers
> > still hang after the double panic :(
>
> Try this. Calling vnode_pager_setsize() while holding a mutex is prohibited.
> On the other hand, I remember that my attempt to add a strict assert
> that a vnode is exclusively locked in vnode_pager_setsize() had to be
> reversed because nfs_loadattrcache() is sometimes called without the vnode
> lock held.
>
> commit 2aa7d15c38b0c01e3f724f04d7ed02ce11c82cc0
> Author: Konstantin Belousov <kostik...@gmail.com>
> Date:   Mon Jul 25 11:56:04 2011 +0300
>
>     Postpone the vnode_pager_setsize() call until the nfs node mutex is
>     dropped.
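
For anyone following the thread, here is a minimal user-space sketch of the pattern the commit message describes: record what needs doing while the mutex is held, drop the mutex, and only then make the call that may sleep. This is not the actual patch. It uses pthreads as a stand-in for the kernel mutex, and every name in it (struct node, node_lock, pager_setsize, load_attr) is hypothetical and chosen only for illustration. The lock_held flag is a hand-rolled check that is only meaningful in a single-threaded demo.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an NFS node; NOT the real struct nfsnode. */
struct node {
    pthread_mutex_t lock;       /* plays the role of the non-sleepable node mutex */
    bool            lock_held;  /* demo-only flag, valid for single-threaded use */
    size_t          size;       /* cached file size, protected by lock */
    size_t          pager_size; /* size last passed to the "pager" */
};

static void node_init(struct node *np)
{
    pthread_mutex_init(&np->lock, NULL);
    np->lock_held = false;
    np->size = 0;
    np->pager_size = 0;
}

static void node_lock(struct node *np)
{
    pthread_mutex_lock(&np->lock);
    np->lock_held = true;
}

static void node_unlock(struct node *np)
{
    np->lock_held = false;
    pthread_mutex_unlock(&np->lock);
}

/*
 * Hypothetical stand-in for a call that may sleep. The assert models the
 * rule from the thread: it must not be entered with the node mutex held.
 */
static void pager_setsize(struct node *np, size_t nsize)
{
    assert(!np->lock_held);
    np->pager_size = nsize;
}

/* Update the cached size under the mutex, but defer the sleepable call. */
static void load_attr(struct node *np, size_t new_size)
{
    bool   setsize = false;
    size_t nsize = 0;

    node_lock(np);
    if (np->size != new_size) {
        np->size = new_size;
        nsize = new_size;   /* remember the value while it is protected */
        setsize = true;     /* remember that the pager must be told */
    }
    node_unlock(np);

    if (setsize)
        pager_setsize(np, nsize);   /* safe: the mutex is dropped here */
}
```

The trade-off in this sketch is that the size is copied into a local while the mutex is held, so the deferred call works from a snapshot rather than from the shared field.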

One of the boxes crashed, so its kernel is now running with the patch.
I still get the 3 LORs when services are starting, though:

lock order reversal:
 1st 0xffffff81ee061268 bufwait (bufwait) @ kern/vfs_bio.c:2636
 2nd 0xffffff0006901000 dirhash (dirhash) @ ufs/ufs/ufs_dirhash.c:285
lock order reversal:
 1st 0xffffff0125236c88 so_snd_sx (so_snd_sx) @ kern/uipc_sockbuf.c:145
 2nd 0xffffff01256e9448 nfs (nfs) @ kern/uipc_syscalls.c:2086
lock order reversal:
 1st 0xffffff01253e1c88 so_snd_sx (so_snd_sx) @ kern/uipc_sockbuf.c:145
 2nd 0xffffff01252b5620 ufs (ufs) @ kern/uipc_syscalls.c:2086

I'll keep you posted on whether the patch improves stability.

Regards,

-- 
Herve Boulouis
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"