On Wed, Feb 14, 2007 at 04:09:54PM -0000, Geoff Garside wrote:
> Hi,
> I'm trying to get to the bottom of some issues we have been experiencing
> with a server of ours. We have so far tried replacing the memory in the
> server and we are still experiencing the crashes.
>
> If anyone has any ideas as to what could be causing this, or possible kgdb
> tricks to try.
>
> Server details
> * Dual Xeon 3GHz
> * 1GB DDR2 400MHz
> * 3ware 8006/2LP RAID
> * 2x 160GB SATA drives
>
> Uname Output
> # uname -a
> FreeBSD xxx 5.5-RELEASE-p11 FreeBSD 5.5-RELEASE-p11 #0: Sun Feb 11 17:08:57
> GMT 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/xxx i386
>
>
> Kernel Debugger output
> # kgdb kernel.debug /usr/crash/vmcore.7
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so:
> Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd".
>
> Unread portion of the kernel message buffer:
> Cannot access memory at address 0xc0c3c3a1
> (kgdb) where
> #0 doadump () at pcpu.h:160
> #1 0xc04e09f5 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:412
> #2 0xc04e0d19 in panic (fmt=0xc0623851 "%s") at
> /usr/src/sys/kern/kern_shutdown.c:568
> #3 0xc0601b14 in trap_fatal (frame=0xe7231740, eva=28) at
> /usr/src/sys/i386/i386/trap.c:822
> #4 0xc0601853 in trap_pfault (frame=0xe7231740, usermode=0, eva=28) at
> /usr/src/sys/i386/i386/trap.c:737
> #5 0xc06014ad in trap (frame=
> {tf_fs = -1068564456, tf_es = -1067319280, tf_ds = -1068564464, tf_edi
> = 4, tf_esi = 0, tf_ebp = -417130596, tf_isp = -417130644, tf_ebx = 131074,
> tf_edx = -1013448320, tf_ecx = 0, tf_eax = 4, tf_trapno = 12, tf_err = 2,
> tf_eip = -1068229475, tf_cs = 8, tf_eflags = 66118, tf_esp = 7, tf_ss =
> -417129328}) at /usr/src/sys/i386/i386/trap.c:427
> #6 0xc05ef8aa in calltrap () at /usr/src/sys/i386/i386/exception.s:140
> #7 0xc04f0018 in MD5Update (context=0x4, input=0x20002 <Address 0x20002 out
> of bounds>, inputLen=3281518976) at /usr/src/sys/kern/md5c.c:172
> #8 0xc049fc21 in procfs_doprocfile (td=0xc3980180, p=0xc4951a98,
> pn=0xc1fe7c00, sb=0xe72317f0, uio=0x0) at /usr/src/sys/fs/procfs/procfs.c:73
> #9 0xc04a3e90 in pfs_readlink (va=0x0) at pcpu.h:157
> #10 0xc053cbb8 in kern_readlink (td=0xc3980180, path=0x0,
> pathseg=UIO_USERSPACE, buf=0x0, bufseg=UIO_USERSPACE, count=1024) at
> vnode_if.h:925
> #11 0xc053cade in readlink (td=0xc3980180, uap=0x0) at
> /usr/src/sys/kern/vfs_syscalls.c:2197
> #12 0xc0601e4f in syscall (frame=
> {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 135512892, tf_esi =
> 135663632, tf_ebp = -1077940936, tf_isp = -417129116, tf_ebx = 674101364,
> tf_edx = -1077941960, tf_ecx = 0, tf_eax = 58, tf_trapno = 135517392, tf_err
> = 2, tf_eip = 672575044, tf_cs = 31, tf_eflags = 647, tf_esp = -1077942020,
> tf_ss = 47}) at /usr/src/sys/i386/i386/trap.c:1014
> #13 0xc05ef8ff in Xint0x80_syscall () at
> /usr/src/sys/i386/i386/exception.s:201
> #14 0x0000002f in ?? ()
> #15 0x0000002f in ?? ()
> #16 0x0000002f in ?? ()
> #17 0x0813c33c in ?? ()
> #18 0x08161010 in ?? ()
> #19 0xbfbfed38 in ?? ()
> #20 0xe7231d64 in ?? ()
> #21 0x282df874 in ?? ()
> #22 0xbfbfe938 in ?? ()
> #23 0x00000000 in ?? ()
> #24 0x0000003a in ?? ()
> #25 0x0813d4d0 in ?? ()
> #26 0x00000002 in ?? ()
> #27 0x2816ae44 in ?? ()
> #28 0x0000001f in ?? ()
> #29 0x00000287 in ?? ()
> #30 0xbfbfe8fc in ?? ()
> #31 0x0000002f in ?? ()
> #32 0x00000000 in ?? ()
> #33 0x00000000 in ?? ()
> #34 0x00000000 in ?? ()
> #35 0x00000000 in ?? ()
> #36 0x0bf63000 in ?? ()
> #37 0xc3984a98 in ?? ()
> #38 0xc3980180 in ?? ()
> #39 0xe7231600 in ?? ()
> #40 0xe72315e8 in ?? ()
> #41 0xc1efc780 in ?? ()
> #42 0xc04f105f in sched_switch (td=0x8161010, newtd=0x282df874, flags=Cannot
> access memory at address 0xbfbfed48
> ) at /usr/src/sys/kern/sched_4bsd.c:881
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
>
> Regards,
> Geoff Garside

You may try the following patch (this seems to be the issue I fixed
recently in HEAD and RELENG_6). On the other hand, I do not know the
right locking protocol for RELENG_5.

Index: fs/procfs/procfs.c
===================================================================
RCS file: /usr/local/arch/ncvs/src/sys/fs/procfs/procfs.c,v
retrieving revision 1.11.2.1
diff -u -r1.11.2.1 procfs.c
--- fs/procfs/procfs.c	31 Jan 2005 23:25:58 -0000	1.11.2.1
+++ fs/procfs/procfs.c	14 Feb 2007 17:59:12 -0000
@@ -69,10 +69,12 @@
 {
 	char *fullpath = "unknown";
 	char *freepath = NULL;
+	struct vnode *textvp;
 
-	vn_lock(p->p_textvp, LK_EXCLUSIVE | LK_RETRY, td);
-	vn_fullpath(td, p->p_textvp, &fullpath, &freepath);
-	VOP_UNLOCK(p->p_textvp, 0, td);
+	textvp = p->p_textvp;
+	vn_lock(textvp, LK_EXCLUSIVE | LK_RETRY, td);
+	vn_fullpath(td, textvp, &fullpath, &freepath);
+	VOP_UNLOCK(textvp, 0, td);
 	sbuf_printf(sb, "%s", fullpath);
 	if (freepath)
 		free(freepath, M_TEMP);
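
For readers following along, here is a small user-space sketch of the pattern the patch appears to apply: read p->p_textvp once into a local and use that same pointer for lock, lookup, and unlock. This is only one reading of the diff, not a statement of the actual root cause. Everything in it (fake_vnode, fake_proc, fake_lock, fake_unlock, swap_textvp, swap_target, and the single-threaded "interfere" callback that stands in for a concurrent update) is hypothetical and is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the kernel's struct vnode/proc. */
struct fake_vnode {
	int lock_count;		/* +1 per lock, -1 per unlock */
	const char *path;
};

struct fake_proc {
	struct fake_vnode *p_textvp;	/* may be replaced behind our back */
};

void fake_lock(struct fake_vnode *vp)   { assert(vp != NULL); vp->lock_count++; }
void fake_unlock(struct fake_vnode *vp) { assert(vp != NULL); vp->lock_count--; }

/* Simulates another thread replacing p->p_textvp (an assumption of this sketch). */
struct fake_vnode *swap_target;
void swap_textvp(struct fake_proc *p) { p->p_textvp = swap_target; }

/* Pre-patch shape: p->p_textvp is re-read at every step. */
const char *
path_rereading(struct fake_proc *p, void (*interfere)(struct fake_proc *))
{
	const char *path;

	fake_lock(p->p_textvp);
	path = p->p_textvp->path;
	if (interfere != NULL)
		interfere(p);		/* pointer may change here */
	fake_unlock(p->p_textvp);	/* may release a different vnode */
	return (path);
}

/* Patched shape: one read of p->p_textvp, reused for every step. */
const char *
path_snapshot(struct fake_proc *p, void (*interfere)(struct fake_proc *))
{
	struct fake_vnode *textvp = p->p_textvp;
	const char *path;

	fake_lock(textvp);
	path = textvp->path;
	if (interfere != NULL)
		interfere(p);		/* pointer may change, local does not */
	fake_unlock(textvp);		/* releases the vnode that was locked */
	return (path);
}
```

With swap_textvp passed as the callback, path_rereading leaves the first vnode locked and unlocks one it never locked, while path_snapshot keeps both counts balanced. Whether the snapshot alone is sufficient in the real kernel depends on the locking protocol for that branch, which the sketch does not model.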