On Feb 17, 2012, at 8:27 PM, Konstantin Belousov wrote:

> On Thu, Feb 16, 2012 at 12:07:46PM -0500, Paul Mather wrote:
>> On Feb 16, 2012, at 10:49 AM, Konstantin Belousov wrote:
>>
>>> On Thu, Feb 16, 2012 at 10:09:27AM -0500, Paul Mather wrote:
>>>> On Feb 14, 2012, at 7:47 PM, Konstantin Belousov wrote:
>>>>
>>>>> On Tue, Feb 14, 2012 at 09:38:18AM -0500, Paul Mather wrote:
>>>>>> I have a problem with RELENG_8 (FreeBSD/amd64 running a GENERIC kernel,
>>>>>> last built 2012-02-08).  It will panic during the daily periodic scripts
>>>>>> that run at 3am.  Here is the most recent panic message:
>>>>>>
>>>>>> Fatal trap 9: general protection fault while in kernel mode
>>>>>> cpuid = 0; apic id = 00
>>>>>> instruction pointer     = 0x20:0xffffffff8069d266
>>>>>> stack pointer           = 0x28:0xffffff8094b90390
>>>>>> frame pointer           = 0x28:0xffffff8094b903a0
>>>>>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>>>>>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>> processor eflags        = resume, IOPL = 0
>>>>>> current process         = 72566 (ps)
>>>>>> trap number             = 9
>>>>>> panic: general protection fault
>>>>>> cpuid = 0
>>>>>> KDB: stack backtrace:
>>>>>> #0 0xffffffff8062cf8e at kdb_backtrace+0x5e
>>>>>> #1 0xffffffff805facd3 at panic+0x183
>>>>>> #2 0xffffffff808e6c20 at trap_fatal+0x290
>>>>>> #3 0xffffffff808e715a at trap+0x10a
>>>>>> #4 0xffffffff808cec64 at calltrap+0x8
>>>>>> #5 0xffffffff805ee034 at fill_kinfo_thread+0x54
>>>>>> #6 0xffffffff805eee76 at fill_kinfo_proc+0x586
>>>>>> #7 0xffffffff805f22b8 at sysctl_out_proc+0x48
>>>>>> #8 0xffffffff805f26c8 at sysctl_kern_proc+0x278
>>>>>> #9 0xffffffff8060473f at sysctl_root+0x14f
>>>>>> #10 0xffffffff80604a2a at userland_sysctl+0x14a
>>>>>> #11 0xffffffff80604f1a at __sysctl+0xaa
>>>>>> #12 0xffffffff808e62d4 at amd64_syscall+0x1f4
>>>>>> #13 0xffffffff808cef5c at Xfast_syscall+0xfc
>>>>>
>>>>> Please look up the line number for the fill_kinfo_thread+0x54.
>>>>
>>>>
>>>> Is there a way for me to do this from the above information?  As
>>>> I said in the original message, I failed to get a crash dump after
>>>> reboot (because, it turns out, I hadn't set up my gmirror swap device
>>>> properly).  Alas, with the latest panic, it appears to have hung[1]
>>>> during the "Dumping" phase, so it looks like I won't get a saved crash
>>>> dump this time, either. :-(
>>>
>>> Load the kernel.debug into kgdb, and from there do
>>> "list *fill_kinfo_thread+0x54".
>>
>>
>> gromit# kgdb /usr/obj/usr/src/sys/GENERIC/kernel.debug
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "amd64-marcel-freebsd"...
>> (kgdb) list *fill_kinfo_thread+0x54
>> 0xffffffff805ee034 is in fill_kinfo_thread (/usr/src/sys/kern/kern_proc.c:854).
>> 849             thread_lock(td);
>> 850             if (td->td_wmesg != NULL)
>> 851                     strlcpy(kp->ki_wmesg, td->td_wmesg, sizeof(kp->ki_wmesg));
>> 852             else
>> 853                     bzero(kp->ki_wmesg, sizeof(kp->ki_wmesg));
>> 854             strlcpy(kp->ki_ocomm, td->td_name, sizeof(kp->ki_ocomm));
>> 855             if (TD_ON_LOCK(td)) {
>> 856                     kp->ki_kiflag |= KI_LOCKBLOCK;
>> 857                     strlcpy(kp->ki_lockname, td->td_lockname,
>> 858                         sizeof(kp->ki_lockname));
>> (kgdb)
>
> This is indeed strange. It can only occur if td pointer is damaged.
>
> Please, try to get a core and at least print the content of *td in this case.
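
Since no core dump is available, the "list *symbol+offset" lookup quoted
above is the main tool left, and it may be worth running it for every
frame of a backtrace rather than just one.  What follows is only a rough
sketch of how that could be scripted: the function name
kgdb_list_commands and the regular expression are my own (hypothetical),
and it assumes the frames look exactly like the KDB output in this
thread.  It only prints the commands to paste into kgdb; it does not run
kgdb itself.  Frames that belong to loadable modules may not resolve
against kernel.debug alone.

```python
import re

# Matches KDB backtrace frames such as:
#   #5 0xffffffff805ee034 at fill_kinfo_thread+0x54
# The pattern is searched anywhere in the text, so frames that carry
# mail quoting prefixes (">>>>>> #5 ...") are matched as well.
FRAME_RE = re.compile(
    r"#(\d+)\s+(0x[0-9a-fA-F]+)\s+at\s+([A-Za-z_][A-Za-z0-9_]*)\+(0x[0-9a-fA-F]+)"
)


def kgdb_list_commands(backtrace_text):
    """Return one kgdb 'list *symbol+offset' command per frame, in order."""
    commands = []
    for match in FRAME_RE.finditer(backtrace_text):
        _frame, _address, symbol, offset = match.groups()
        commands.append("list *%s+%s" % (symbol, offset))
    return commands


if __name__ == "__main__":
    # Frames copied from the first panic quoted in this thread.
    sample = """
    #4 0xffffffff808cec64 at calltrap+0x8
    #5 0xffffffff805ee034 at fill_kinfo_thread+0x54
    #6 0xffffffff805eee76 at fill_kinfo_proc+0x586
    #7 0xffffffff805f22b8 at sysctl_out_proc+0x48
    """
    for command in kgdb_list_commands(sample):
        print(command)
```

The printed lines can then be pasted at the (kgdb) prompt after loading
kernel.debug as shown above.
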

Another panic last night, after reverting the "dsmc schedule" scripts to
use "/bin/sh" (actually /compat/linux/bin/sh):

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x308
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff806026ef
stack pointer           = 0x28:0xffffff8094af02d0
frame pointer           = 0x28:0xffffff8094af0350
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 90872 (df)
trap number             = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff8062cf8e at kdb_backtrace+0x5e
#1 0xffffffff805facd3 at panic+0x183
#2 0xffffffff808e6c20 at trap_fatal+0x290
#3 0xffffffff808e6f71 at trap_pfault+0x201
#4 0xffffffff808e742f at trap+0x3df
#5 0xffffffff808cec64 at calltrap+0x8
#6 0xffffffff80602e1e at _sx_xlock+0x4e
#7 0xffffffff80f9ca35 at rrw_enter+0xa5
#8 0xffffffff80f9ce86 at zfs_statfs+0x46
#9 0xffffffff80681258 at __vfs_statfs+0x28
#10 0xffffffff81476521 at nullfs_statfs+0x51
#11 0xffffffff80681258 at __vfs_statfs+0x28
#12 0xffffffff80690b22 at kern_statfs+0x1b2
#13 0xffffffff80690c77 at statfs+0x37
#14 0xffffffff808e62d4 at amd64_syscall+0x1f4
#15 0xffffffff808cef5c at Xfast_syscall+0xfc

Alas, the system became "hung" here: there was no further output
indicating memory being dumped to the dumpdev, and no core dump was
found during the subsequent (forced) reboot. :-(

Note that this panic is different from the previous one.  Also, the
presence of nullfs_statfs in the backtrace above is very curious.
According to my logs, the daily backup had already finished successfully
and thus the nullfs-mounted file systems would have been unmounted
before the system panicked:

02/22/12   02:08:11 --- SCHEDULEREC STATUS END
02/22/12   02:08:11 --- SCHEDULEREC OBJECT END DESKTOP_DAILY_BACKUP 02/22/12   02:00:00
02/22/12   02:08:11 Executing Operating System command or script:
   /usr/local/bin/remove_zfs_backup_snapshot
02/22/12   02:08:12 Finished command.  Return code is: 0
02/22/12   02:08:12 Scheduled event 'DESKTOP_DAILY_BACKUP' completed successfully.
02/22/12   02:08:12 Sending results for scheduled event 'DESKTOP_DAILY_BACKUP'.
02/22/12   02:08:12 Results sent to server for scheduled event 'DESKTOP_DAILY_BACKUP'.
02/22/12   02:08:12 ANS1483I Schedule log pruning started.
02/22/12   02:08:15 ANS1484I Schedule log pruning finished successfully.

Other logs indicate that the system was up until 3 am, whereupon,
presumably, "periodic daily" precipitated a panic somewhere during its
execution.

Cheers,

Paul.

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"