Re: Not panic in nfsd (Re: panic in nfsd on 6.2-RC1)

Sven Willenberger Mon, 15 Jan 2007 08:33:18 -0800

On Sat, 2007-01-13 at 15:11 -0500, Kris Kennaway wrote:
> On Sat, Dec 30, 2006 at 06:04:13PM -0500, Sven Willenberger wrote:
> > 
> > 
> > Sven Willenberger presumably uttered the following on 12/18/06 12:33:
> > > On Fri, 2006-12-15 at 23:20 +0200, Kostik Belousov wrote:
> > >> On Fri, Dec 15, 2006 at 02:29:58PM -0500, Kris Kennaway wrote:
> > > 
> > > <<SNIP>>
> > > 
> > >>>  
> > >>>> FWIW, I do see the following appearing in the /var/log/messages:
> > >>>> ufs_rename: fvp == tvp (can't happen) 
> > >>>> about once or twice a day, but cannot correlate those to lockup. Now
> > >>>> that I have enabled the options mentioned above in the kernel, I am
> > >>>> seeing some LOR issues:
> > >>>>
> > >>>> kernel: lock order reversal:
> > >>>> kernel: 1st 0xffffff00c3bab200 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
> > >>>> kernel: 2nd 0xffffff0005bb6078 struct mount mtx (struct mount mtx) @ /usr/src/sys/ufs/ufs/ufs_vnops.c:138
> > >>> OK, this is interesting, so let's proceed from here.
> > >>>
> > >>> Kris
> > >> Try this.
> > >>
> > >> Index: ufs/ufs/ufs_vnops.c
> > >> ===================================================================
> > >> RCS file: /usr/local/arch/ncvs/src/sys/ufs/ufs/ufs_vnops.c,v
> > >> retrieving revision 1.283
> > >> diff -u -r1.283 ufs_vnops.c
> > >> --- ufs/ufs/ufs_vnops.c  6 Nov 2006 13:42:09 -0000       1.283
> > >> +++ ufs/ufs/ufs_vnops.c  15 Dec 2006 21:19:51 -0000
> > >> @@ -133,19 +133,15 @@
> > >>  {
> > >>          struct inode *ip;
> > >>          struct timespec ts;
> > >> -        int mnt_locked;
> > >>  
> > >>          ip = VTOI(vp);
> > >> -        mnt_locked = 0;
> > >> -        if ((vp->v_mount->mnt_flag & MNT_RDONLY) != 0) {
> > >> -                VI_LOCK(vp);
> > >> +        VI_LOCK(vp);
> > >> +        if ((vp->v_mount->mnt_flag & MNT_RDONLY) != 0)
> > >>                  goto out;
> > >> +        if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0) {
> > >> +                VI_UNLOCK(vp);
> > >> +                return;
> > >>          }
> > >> -        MNT_ILOCK(vp->v_mount);         /* For reading of mnt_kern_flags. */
> > >> -        mnt_locked = 1;
> > >> -        VI_LOCK(vp);
> > >> -        if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0)
> > >> -                goto out_unl;
> > >>  
> > >>          if ((vp->v_type == VBLK || vp->v_type == VCHR) && !DOINGSOFTDEP(vp))
> > >>                  ip->i_flag |= IN_LAZYMOD;
> > >> @@ -172,10 +168,7 @@
> > >>  
> > >>   out:
> > >>          ip->i_flag &= ~(IN_ACCESS | IN_CHANGE | IN_UPDATE);
> > >> - out_unl:
> > >>          VI_UNLOCK(vp);
> > >> -        if (mnt_locked)
> > >> -                MNT_IUNLOCK(vp->v_mount);
> > >>  }
> > >>  
> > >>  /*
> > > 
> > > 
> > > Patch applied cleanly (offset 6 lines), make buildworld, make kernel,
> > > reboot, make installworld, etc.
> > > 
> > > kernel: lock order reversal:
> > > kernel: 1st 0xffffff00b9181800 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
> > > kernel: 2nd 0xffffff00c16030d0 vnode interlock (vnode interlock) @ /usr/src/sys/ufs/ufs/ufs_vnops.c:132
> > > 
> > > 
> > > 
> > > _______________________________________________
> > 
> > Having enabled witness and ddb, etc I cannot get this LOR to trigger anymore,
> > but the machine is still locking up. I finally managed to get a piece of what
> > was appearing on the console which is the following (copied by hand by an
> > onsite tech so there may be a typo here and there):
> > 
> > --------cut--------------
> > 
> > bge_intr() at loge_intr+0x84a
> > ithread_loop() at ithread_loop+0x14c
> > fork_exit() at fork_exit+0xbb
> > fork_trampoline() at fork_trampoline+0xee
> > --- trap 0, rip-0, rsp-0xffffffffb371ad00, rbp-0 ---
> > 
> > Fatal trap 12: page fault while in Kernel Mode
> > cpuid=1, apic id=01
> > fault virtual address - 0x28
> > fault code - supervisor write, page not present
> > instruction pointer - 0x8:0xffffffff801dae1a
> > stack pointer - 0x10:0xffffffffb371ab70
> > frame pointer - 0x10:0xffffffffb371abd0
> > code segment - base 0x0, limit 0xfffff, type 0x1b
> >              - DPL 0, pres 1, long 1, def32 0, gram 1
> > 
> > processor eflags=interrupt enabled, resume, IOPL=0
> > current process=28 (irq 24:bge0)
> > trap number=12
> > panic: page fault
> > cpuid=1
> > 
> > Uptime - 4d10h52m36s
> > Dumping 4031MB (2 chunks)
> > chunk0: 1MB (156 pages)... ok
> > chunk1: 4031MB (1031920)
> > 
> > ----------cut-----------------
> > 
> > For some reason, by the time it reboots, there is no dump file available
> > (even though it is enabled in rc.conf and there is more than enough room in
> > /var/crash to hold it).
> 
> This is indicating a problem either with your bge hardware or the driver.
> 
> Kris


I suspect the driver: this same hardware setup was being used as a
database server with FreeBSD 5.4. I had been using the bge driver, but set
at 100baseT, without any issue at all. It was when I did a clean install
of 6.2-Prerelease and set bge to use the full gigE speed (via
autonegotiation) that these issues cropped up.
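
For anyone wanting to try the same comparison, here is a rough rc.conf
sketch of the two setups. The address and netmask are placeholders (not
taken from this machine), and 100baseTX with full-duplex is only an
assumption about how the 100 Mbit setting would be pinned; "media" and
"mediaopt" themselves are standard ifconfig(8) keywords.

```shell
# /etc/rc.conf fragment (sketch only; address and netmask are placeholders)

# Pin bge0 to 100 Mbit full duplex, roughly as in the old 5.4 setup
# (the exact media type used back then is an assumption):
ifconfig_bge0="inet 192.0.2.10 netmask 255.255.255.0 media 100baseTX mediaopt full-duplex"

# Alternative: let bge0 autonegotiate up to gigabit, which is the
# configuration under which the lockups appeared:
# ifconfig_bge0="inet 192.0.2.10 netmask 255.255.255.0 media autoselect"
```

If the lockups also stop with bge pinned at 100 Mbit on 6.2, that would
point more firmly at the gigabit path in the driver than at the card.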

Since changing to the onboard fxp interface I have (knock on wood) not
had an issue in some 7 days (as opposed to the random lockups/reboots
occurring every 3 days using bge).

Sven

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
