I remember seeing this panic many months back. I had fixed it in our FreeBSD source base.
Here is the scenario:

    vrele() {
        if (vp->v_usecount == 1) {
            vp->v_usecount--;
            if (VSHOULDFREE(vp))
                vfree(vp);
            vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK, p);     ---- (A)
            VOP_INACTIVE() {    /* translates to nfs_inactive() */
                sp = np->n_sillyrename;
                if (sp) {
                    if (ap->a_vp->v_usecount > 0) {
                        ...
                    } else {
                        vget(ap->a_vp, 0, p);
                        nfs_vinvalbuf(ap->a_vp, 0, ...);
                        vrele(ap->a_vp);                     ---- (B)
                    }
                    ...
                }
            }
        }
    }

vrele() calls nfs_inactive() once vp->v_usecount reaches 0. Before calling nfs_inactive() it locks the vnode at (A), since VOP_INACTIVE() expects the vnode to be locked and releases the lock when it is done. In nfs_inactive() a check is made to see whether the file was renamed or deleted while it was still in use. In this case it was, so np->n_sillyrename is valid. To make sure the vnode does not get reused while doing the I/O associated with discarding the buffers, v_usecount is bumped with vget(). Then nfs_vinvalbuf() is called to flush and invalidate all dirty buffers. After that, vrele() is called on the vnode at (B).

That second vrele() again comes to (A) and tries to lock the vnode, which the same process already locked on the first pass, and the kernel panics with "locking against myself". Even without the panic, the vrele() done at (B) would cause the cycle to repeat: vrele() calls nfs_inactive(), which calls vrele(), and so forth.

In nfs_inactive() we know that the caller will free the vnode, so we need not call vfree(). We also know that we should not lock the vnode again and call VOP_INACTIVE(). The fix, then, is to inline just the part of vrele() that decrements the use count. The following is the fix, against the 4.3 source base:

==== //depot/dev/freebsd-brick/src/sys/nfs/nfs_node.c#3 (text) ====
225c225,244
<                       vrele(ap->a_vp);
---
>                       /*
>                        * Inline part of vrele() since VOP_INACTIVE
>                        * has already been called.
>                        */
>                       simple_lock(&ap->a_vp->v_interlock);
>                       if (--ap->a_vp->v_usecount <= 0) {
> #ifdef DIAGNOSTIC
>                               if (ap->a_vp->v_usecount < 0 ||
>                                   ap->a_vp->v_writecount != 0) {
>                                       vprint("nfs_inactive: bad ref count",
>                                               ap->a_vp);
>                                       panic("nfs_inactive: ref cnt");
>                               }
> #endif
>                               /*
>                                * Since this is nfs_inactive(), vfree()
>                                * would be called from the caller.
>                                */
>                       }
>                       simple_unlock(&ap->a_vp->v_interlock);
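To make the cycle easier to see outside the kernel, here is a small userspace model of the two paths. This is illustrative only: fakevnode, release() and inactive() are made-up stand-ins for the vnode, vrele() and nfs_inactive(), and the "lock" is a plain flag rather than a lockmgr lock.

    /*
     * Userspace model of the vrele()/nfs_inactive() recursion.
     * Compile with any C compiler and run; the unfixed path hits
     * the simulated "locking against myself" panic.
     */
    #include <stdio.h>
    #include <stdlib.h>

    struct fakevnode {
        int usecount;
        int locked;             /* held across the inactive callback */
    };

    static void release(struct fakevnode *vp, int use_fix);
    static void inactive(struct fakevnode *vp, int use_fix);

    static void
    lock(struct fakevnode *vp)
    {
        if (vp->locked) {
            /* This is where lockmgr() panics in the real kernel. */
            fprintf(stderr, "panic: lockmgr: locking against myself\n");
            exit(1);
        }
        vp->locked = 1;
    }

    static void
    unlock(struct fakevnode *vp)
    {
        vp->locked = 0;
    }

    /* Mirrors the vrele() path shown above. */
    static void
    release(struct fakevnode *vp, int use_fix)
    {
        if (vp->usecount == 1) {
            vp->usecount--;
            lock(vp);                   /* (A) */
            inactive(vp, use_fix);      /* VOP_INACTIVE(); unlocks */
        } else
            vp->usecount--;
    }

    /* Mirrors the n_sillyrename branch of nfs_inactive(). */
    static void
    inactive(struct fakevnode *vp, int use_fix)
    {
        vp->usecount++;                 /* vget(): hold for the flush */
        /* ... nfs_vinvalbuf(): flush and discard buffers ... */
        if (use_fix)
            vp->usecount--;             /* inlined decrement, no recursion */
        else
            release(vp, use_fix);       /* (B): re-enters and "panics" */
        unlock(vp);
    }

    int
    main(void)
    {
        struct fakevnode vn = { 1, 0 };

        release(&vn, 1);                /* with the fix: completes */
        printf("fixed path: usecount %d, locked %d\n",
            vn.usecount, vn.locked);

        vn.usecount = 1;
        vn.locked = 0;
        release(&vn, 0);                /* without the fix */
        return 0;
    }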
Deepankar

Don Lewis wrote:

> I've gotten a couple "lockmgr: locking against myself" panics today and
> it seems to be somewhat reproducible.  I had the serial console hooked
> up for the last one, so I was able to get a stack trace:
>
> panic: lockmgr: locking against myself
> Debugger("panic")
> Stopped at      Debugger+0x45:  xchgl   %ebx,in_Debugger.0
> db> tr
> Debugger(c0449abc) at Debugger+0x45
> panic(c0447760,215,0,c6cd7de0,10002) at panic+0x7c
> lockmgr(c6cd7ea8,10002,c6cd7de0,c6381f00,c6cd7de0) at lockmgr+0x412
> vop_sharedlock(e4b90b58,c047c440,c6cd7de0,1010002,c6381f00) at vop_sharedlock+0x6e
> vn_lock(c6cd7de0,10002,c6381f00) at vn_lock+0xb4
> vrele(c6cd7de0,c6cd7de0,0,c681cb00,c6381f00,1) at vrele+0x8d
> nfs_inactive(e4b90bdc,c047c3c0,c6cd7de0,c6381f00,c6381f00) at nfs_inactive+0xbd
> VOP_INACTIVE(c6cd7de0,c6381f00) at VOP_INACTIVE+0xa0
> vrele(c6cd7de0,c6834474,c6834474,e4b90c38,c02eed4a) at vrele+0x9b
> vn_close(c6cd7de0,2,c681cb00,c6381f00,e4b90c70) at vn_close+0x37
> vn_closefile(c6834474,c6381f00) at vn_closefile+0x1e
> fdrop_locked(c6834474,c6381f00,c04fe8ac,0,c0446020) at fdrop_locked+0x102
> fdrop(c6834474,c6381f00,c028a1ec,c6381f00,c6377c8c) at fdrop+0x24
> closef(c6834474,c6381f00,c6834474,c6c0d3f8,c6381f00) at closef+0x77
> close(c6381f00,e4b90d14,1,b9,202) at close+0x107
> syscall(2f,2f,2f,82124cc,23) at syscall+0x243
> Xint0x80_syscall() at Xint0x80_syscall+0x1d
> --- syscall (6, FreeBSD ELF32, close), eip = 0x4862ff3b, esp = 0xbfbff2dc, ebp = 0xbfbff358 ---
>
> db> show lockedvnods
> Locked vnodes
> 0xc6cd7de0: type VREG, usecount 0, writecount 0, refcount 0, flags (VV_OBJBUF)
>         tag VT_NFS, fileid 112174 fsid 0x100ff02
>
> It looks like the call to vrele() from vn_close() is executing the
> following code:
>
>         if (vp->v_usecount == 1) {
>                 vp->v_usecount--;
>                 /*
>                  * We must call VOP_INACTIVE with the node locked.
>                  * If we are doing a vput, the node is already locked,
>                  * but, in the case of vrele, we must explicitly lock
>                  * the vnode before calling VOP_INACTIVE.
>                  */
>                 if (vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK, td) == 0)
>                         VOP_INACTIVE(vp, td);
>                 VI_LOCK(vp);
>                 if (VSHOULDFREE(vp))
>                         vfree(vp);
>                 else
>                         vlruvp(vp);
>                 VI_UNLOCK(vp);
>
> It has decremented v_usecount to 0, grabbed an exclusive lock on the
> vnode, and has called VOP_INACTIVE().
>
> It looks like nfs_inactive() is executing the following code:
>
>         if (sp) {
>                 /*
>                  * We need a reference to keep the vnode from being
>                  * recycled by getnewvnode while we do the I/O
>                  * associated with discarding the buffers unless we
>                  * are being forcibly unmounted in which case we already
>                  * have our own reference.
>                  */
>                 if (ap->a_vp->v_usecount > 0)
>                         (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
>                 else if (vget(ap->a_vp, 0, td))
>                         panic("nfs_inactive: lost vnode");
>                 else {
>                         (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
>                         vrele(ap->a_vp);
>                 }
>
> Because v_usecount was 0, nfs_inactive() calls vget(), which increments
> v_usecount to 1, then calls nfs_vinvalbuf(), and finally calls vrele()
> again.
>
> Because v_usecount is 1, vrele() re-executes the same code and blows
> up when it tries to grab the lock again.  If that didn't cause a panic,
> VOP_INACTIVE() and nfs_inactive() would be called recursively, which
> would probably be a bad thing ...
>
> I suspect that nfs_inactive() needs to avoid calling vrele() here and
> should just decrement the v_usecount with the appropriate locking.
> Something like:
>
>         VI_LOCK(ap->a_vp);
>         ap->a_vp->v_usecount--;
>         VI_UNLOCK(ap->a_vp);
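That suggestion would make the quoted n_sillyrename branch read roughly as follows. This is a sketch against the 5.x code quoted above, not a tested patch; as in the 4.3 fix earlier, vfree() is left to the caller of VOP_INACTIVE().

    if (sp) {
        /*
         * We need a reference to keep the vnode from being
         * recycled by getnewvnode while we do the I/O
         * associated with discarding the buffers.
         */
        if (ap->a_vp->v_usecount > 0)
            (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
        else if (vget(ap->a_vp, 0, td))
            panic("nfs_inactive: lost vnode");
        else {
            (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
            /*
             * Drop the reference from vget() directly instead of
             * calling vrele(), which would re-lock the vnode and
             * re-enter VOP_INACTIVE().
             */
            VI_LOCK(ap->a_vp);
            ap->a_vp->v_usecount--;
            VI_UNLOCK(ap->a_vp);
        }
    }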
--
-------------------------------------------------------------------------------
Deepankar Das
Panasas Inc.
Pioneering the World's Most Scalable and Agile Storage Network
www.panasas.com
[EMAIL PROTECTED]
510-608-4366