On Sat, 2007-01-13 at 15:11 -0500, Kris Kennaway wrote:
> On Sat, Dec 30, 2006 at 06:04:13PM -0500, Sven Willenberger wrote:
> >
> >
> > Sven Willenberger presumably uttered the following on 12/18/06 12:33:
> > > On Fri, 2006-12-15 at 23:20 +0200, Kostik Belousov wrote:
> > >> On Fri, Dec 15, 2006 at 02:29:58PM -0500, Kris Kennaway wrote:
> > >
> > > <<SNIP>>
> > >
> > >>>
> > >>>> FWIW, I do see the following appearing in the /var/log/messages:
> > >>>> ufs_rename: fvp == tvp (can't happen)
> > >>>> about once or twice a day, but cannot correlate those to lockup. Now
> > >>>> that I have enabled the options mentioned above in the kernel, I am
> > >>>> seeing some LOR issues:
> > >>>>
> > >>>> kernel: lock order reversal:
> > >>>> kernel: 1st 0xffffff00c3bab200 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
> > >>>> kernel: 2nd 0xffffff0005bb6078 struct mount mtx (struct mount mtx) @ /usr/src/sys/ufs/ufs/ufs_vnops.c:138
> > >>> OK, this is interesting, so let's proceed from here.
> > >>>
> > >>> Kris
> > >> Try this.
> > >>
> > >> Index: ufs/ufs/ufs_vnops.c
> > >> ===================================================================
> > >> RCS file: /usr/local/arch/ncvs/src/sys/ufs/ufs/ufs_vnops.c,v
> > >> retrieving revision 1.283
> > >> diff -u -r1.283 ufs_vnops.c
> > >> --- ufs/ufs/ufs_vnops.c 6 Nov 2006 13:42:09 -0000 1.283
> > >> +++ ufs/ufs/ufs_vnops.c 15 Dec 2006 21:19:51 -0000
> > >> @@ -133,19 +133,15 @@
> > >>  {
> > >>      struct inode *ip;
> > >>      struct timespec ts;
> > >> -    int mnt_locked;
> > >>
> > >>      ip = VTOI(vp);
> > >> -    mnt_locked = 0;
> > >> -    if ((vp->v_mount->mnt_flag & MNT_RDONLY) != 0) {
> > >> -        VI_LOCK(vp);
> > >> +    VI_LOCK(vp);
> > >> +    if ((vp->v_mount->mnt_flag & MNT_RDONLY) != 0)
> > >>          goto out;
> > >> +    if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0) {
> > >> +        VI_UNLOCK(vp);
> > >> +        return;
> > >>      }
> > >> -    MNT_ILOCK(vp->v_mount); /* For reading of mnt_kern_flags. */
> > >> -    mnt_locked = 1;
> > >> -    VI_LOCK(vp);
> > >> -    if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0)
> > >> -        goto out_unl;
> > >>
> > >>      if ((vp->v_type == VBLK || vp->v_type == VCHR) && !DOINGSOFTDEP(vp))
> > >>          ip->i_flag |= IN_LAZYMOD;
> > >> @@ -172,10 +168,7 @@
> > >>
> > >>   out:
> > >>      ip->i_flag &= ~(IN_ACCESS | IN_CHANGE | IN_UPDATE);
> > >> - out_unl:
> > >>      VI_UNLOCK(vp);
> > >> -    if (mnt_locked)
> > >> -        MNT_IUNLOCK(vp->v_mount);
> > >>  }
> > >>
> > >>  /*
> > >
> > >
> > > Patch applied cleanly (offset 6 lines), make buildworld, make kernel,
> > > reboot, make installworld, etc.
> > >
> > > kernel: lock order reversal:
> > > kernel: 1st 0xffffff00b9181800 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
> > > kernel: 2nd 0xffffff00c16030d0 vnode interlock (vnode interlock) @ /usr/src/sys/ufs/ufs/ufs_vnops.c:132
> > >
> > >
> > >
> > > _______________________________________________
> >
> > Having enabled witness and ddb, etc I cannot get this LOR to trigger
> > anymore, but the machine is still locking up. I finally managed to get
> > a piece of what was appearing on the console which is the following
> > (copied by hand by an onsite tech so there may be a typo here and there):
> >
> > --------cut--------------
> >
> > bge_intr() at loge_intr+0x84a
> > ithread_loop() at ithread_loop+0x14c
> > fork_exit() at fork_exit+0xbb
> > fork_trampoline() at fork_trampoline+0xee
> > --- trap 0, rip-0, rsp-0xffffffffb371ad00, rbp-0 ---
> >
> > Fatal trap 12: page fault while in Kernel Mode
> > cupid=1, apic id=01
> > fault virtual address - 0x28
> > fault code - supervisor write, page not present
> > instruction pointer - 0x8:0xffffffff801dae1a
> > stack pointer - 0x10:0xffffffffb371ab70
> > frame pointer - 0x10:0xffffffffb371abd0
> > code segment - base 0x0, limit 0xfffff, type 0x1b
> > - DPL 0, pres 1, long 1, def32 0, gram 1
> >
> > processor eflags=interrupt enabled, resume, IOPL=0
> > current process=28 (irq 24:bge0)
> > trap number=12
> > panic: page fault
> > cupid=1
> >
> > Uptime - 4d10h52m36s
> > Dumping 4031MB (2 chunks)
> > chunk0: 1MB (156 pages)... ok
> > chunk1: 4031MB (1031920)
> >
> > ----------cut-----------------
> >
> > For some reason, by the time it reboots, there is no dump file available
> > (even though it is enabled in rc.conf and there is more than enough room
> > in /var/crash to hold it).
>
> This is indicating a problem either with your bge hardware or the driver.
>
> Kris

I suspect the driver: this same hardware setup was being used as a database
server with FreeBSD 5.4. I had been using the bge driver, but set at
100baseTX, without any issue at all. It was when I did a clean install of
6.2-Prerelease and set bge to use the full gigE speed (via autonegotiate)
that these issues cropped up. Since changing to the onboard fxp interface I
have (knock on wood) not had an issue in some 7 days (as opposed to the
random lockups/reboots occurring every 3 days using bge).

Sven

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable