On Fri, Mar 06, 2009 at 05:01:12PM -0500, Boris Kochergin wrote:
> Gavin Atkinson wrote:
> >On Thu, 2009-03-05 at 19:55 -0500, Boris Kochergin wrote:
> >
> >>Ahoy. I recently upgraded an amd64 machine to 7.1-RELEASE, and started
> >>getting a bunch of these at a pretty high frequency (a few hours to a
> >>day apart):
> >>
> >>http://acm.poly.edu/~spawk/IMG00033.jpg
> >>
> >>The "current process" is always httpd. They're particularly annoying
> >>because the machine doesn't actually ever reboot, requiring manual
> >>intervention. Reverting the kernel back to 7.0 makes the panic go away,
> >>and the machine had been happily running 7.0 for about a year
> >>beforehand. I realize that the photo hardly contains any useful
> >>debugging information, but I was hoping it might look familiar to
> >>someone. If not, I guess I'll come back with a backtrace.
> >>
> >
> >A backtrace will almost certainly be necessary to figure out what this
> >issue is, although there is a possibility that the output of
> >"addr2line -e /boot/kernel/kernel.symbols 0x8:0xffffffff802d7010"
> >might help, assuming you've not recompiled your kernel yet. (That
> >number should be the same as the "instruction pointer" shown by the
> >panic, but as the photo is quite blurred there's a chance I've got it
> >wrong, if you have a better picture of it or wrote it down then use
> >that)
> >
> >Gavin
> >_______________________________________________
> >freebsd-stable@freebsd.org mailing list
> >http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> >To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
> >
> Here it is, with some additional information afterward:
>
> Unread portion of the kernel message buffer:
> kernel trap 12 with interrupts disabled
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address   = 0x30
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x8:0xffffffff80293faf
> stack pointer           = 0x10:0xffffffff9cbaea70
> frame pointer           = 0x10:0xffffff000fc14000
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = resume, IOPL = 0
> current process         = 881 (httpd)
> trap number             = 12
> panic: page fault
> cpuid = 1
> Uptime: 1m51s
> Physical memory: 8185 MB
> Dumping 328 MB: 313 297 281 265 249 233 217 201 185 169 153 137 121 105
> 89 73 57 41 25 9
>
> #0 doadump () at pcpu.h:195
> 195 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) where
> #0 doadump () at pcpu.h:195
> #1 0xffffff000fc14000 in ?? ()
> #2 0xffffffff8025eba9 in boot (howto=260) at
> /usr/src-7.1/sys/kern/kern_shutdown.c:418
> #3 0xffffffff8025efb2 in panic (fmt=0x104 <Address 0x104 out of
> bounds>) at /usr/src-7.1/sys/kern/kern_shutdown.c:574
> #4 0xffffffff803df5c3 in trap_fatal (frame=0xffffff000fc14000,
> eva=Variable "eva" is not available.
> ) at /usr/src-7.1/sys/amd64/amd64/trap.c:764
> #5 0xffffffff803e018f in trap (frame=0xffffffff9cbae9c0) at
> /usr/src-7.1/sys/amd64/amd64/trap.c:290
> #6 0xffffffff803c5c4e in calltrap () at
> /usr/src-7.1/sys/amd64/amd64/exception.S:209
> #7 0xffffffff80293faf in turnstile_broadcast (ts=0x0, queue=0) at
> /usr/src-7.1/sys/kern/subr_turnstile.c:836
> #8 0xffffffff8025256a in _mtx_unlock_sleep (m=0xffffffff80593538,
> opts=Variable "opts" is not available.
> ) at /usr/src-7.1/sys/kern/kern_mutex.c:619
> #9 0xffffffff80275ed3 in __umtx_op_cv_wait (td=0x1ee, uap=Variable
> "uap" is not available.
> ) at /usr/src-7.1/sys/kern/kern_umtx.c:312
> #10 0xffffffff803dfb78 in syscall (frame=0xffffffff9cbaec80) at
> /usr/src-7.1/sys/amd64/amd64/trap.c:907
> #11 0xffffffff803c5e5b in Xfast_syscall () at
> /usr/src-7.1/sys/amd64/amd64/exception.S:330
> #12 0x0000000800f5354c in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
>
> The dump was difficult to acquire--the system would often lock up after
> dumping only a portion of the memory it wanted to save. I can also now
> trigger the panic pretty reliably using this bit of script:
>
> #!/usr/local/bin/bash
>
> for i in {1..900}
> do
>   wget --quiet -O /dev/null http://acm.poly.edu/wiki/Hosting &
> done
>
> ...where the URL is a MediaWiki installation on the afflicted machine.

Could you please recompile the kernel with debugging options and reproduce the panic on it? We need at least options INVARIANTS, INVARIANT_SUPPORT and WITNESS.
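
For reference, a minimal sketch of a kernel configuration file that enables these options. The file name DEBUG and its location under sys/amd64/conf in the source tree are assumptions for illustration, as is the choice to build on top of GENERIC; if the machine runs a custom configuration, include that one instead.

```
# Hypothetical file: sys/amd64/conf/DEBUG (the name is an assumption)
include         GENERIC             # start from the stock configuration
ident           DEBUG

options         INVARIANTS          # enable run-time consistency checks
options         INVARIANT_SUPPORT   # support code required by INVARIANTS
options         WITNESS             # lock order verification
```

A kernel built from such a file is normally compiled and installed with "make buildkernel KERNCONF=DEBUG" followed by "make installkernel KERNCONF=DEBUG", run from the top of the source tree. These checks add noticeable overhead, so the panic may take longer to trigger or may show up as a different assertion failure.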