Thanks for looking into this.

On 10/14/2015 4:42 PM, Konstantin Belousov wrote:
On Wed, Oct 14, 2015 at 03:52:47PM +0200, Frank Razenberg wrote:
After upgrading from 9.2 to 10.1 I first started noticing panics. They
occurred roughly weekly and since this storage machine isn't frequently
used I didn't look into it much further. After updating to 10.2-STABLE
the panics have gone from weekly to daily.
The machine has 32GB of non-registered ECC DDR3-1066 RAM. There's also a
10-disk raidz2 pool. I've run memtest86+ for 72 hours straight with no
errors.

Crash dumps all feature the following:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 12
fault virtual address   = 0x1d1c0bec0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff804fda65
stack pointer           = 0x28:0xfffffe0698f21870
frame pointer           = 0x28:0xfffffe0698f218d0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                          = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 6106 (pickup)
trap number             = 12
panic: page fault
cpuid = 2


(kgdb) bt
#0  doadump (textdump=<value optimized out>) at pcpu.h:219
#1  0xffffffff8053ce32 in kern_reboot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:455
#2  0xffffffff8053d215 in vpanic (fmt=<value optimized out>, ap=<value
optimized out>) at /usr/src/sys/kern/kern_shutdown.c:762
#3  0xffffffff8053d0a3 in panic (fmt=0x0) at
/usr/src/sys/kern/kern_shutdown.c:691
#4  0xffffffff807755db in trap_fatal (frame=<value optimized out>,
eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:851
#5  0xffffffff807758dd in trap_pfault (frame=0xfffffe0698dbc7c0,
usermode=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:674
#6  0xffffffff80774f7a in trap (frame=0xfffffe0698dbc7c0) at
/usr/src/sys/amd64/amd64/trap.c:440
#7  0xffffffff8075b0f2 in calltrap () at
/usr/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff804fda65 in kqueue_close (fp=0xfffff803e4967190,
td=0xfffff80014b094a0) at /usr/src/sys/kern/kern_event.c:1750
#9  0xffffffff804f25f9 in _fdrop (fp=0xfffff803e4967190,
td=0xfffff802b5d2a000) at file.h:343
#10 0xffffffff804f4e9e in closef (fp=<value optimized out>, td=<value
optimized out>) at /usr/src/sys/kern/kern_descrip.c:2338
#11 0xffffffff804f4ab9 in fdescfree (td=0xfffff80014b094a0) at
/usr/src/sys/kern/kern_descrip.c:2106
#12 0xffffffff805013a9 in exit1 (td=0xfffff80014b094a0, rv=<value
optimized out>) at /usr/src/sys/kern/kern_exit.c:369
#13 0xffffffff80500e3e in sys_sys_exit (td=0xfffffe000782e060,
uap=<value optimized out>) at /usr/src/sys/kern/kern_exit.c:179
#14 0xffffffff80775efd in amd64_syscall (td=0xfffff80014b094a0,
traced=0) at subr_syscall.c:134
#15 0xffffffff8075b3db in Xfast_syscall () at
/usr/src/sys/amd64/amd64/exception.S:396
#16 0x000000080120335a in ?? ()

Most of the dumps list 'pickup' as current process. All of them have
'kqueue_close' in the backtrace.
I'm not sure what the next step in diagnosing the issue is. Any pointers
would be greatly appreciated.
What is the exact revision of the checkout you run, where the panic above
occurs?
Not entirely sure. Can I still find out if I've updated my source tree since? It's not in uname -a, but judging by the dates it should be around 289032.
Want me to update to HEAD and do the steps below on that instead?


Please load the kernel.debug + vmcore into kgdb, go to frame 8, and do
p *kq
p *kn
p i
p kq->kq_knlist[i].slh_first
p *(kq->kq_knlist[i].slh_first)
#8 0xffffffff804fda65 in kqueue_close (fp=0xfffff801dd94b1e0, td=0xfffff80015bbc000) at /usr/src/sys/kern/kern_event.c:1750
1750 kn->kn_fop->f_detach(kn);
(kgdb) p *kq
$1 = {kq_lock = {lock_object = {lo_name = 0xffffffff80829725 "kqueue", lo_flags = 21168128, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, kq_refcnt = 1, kq_list = { tqe_next = 0xfffff8015f29fc00, tqe_prev = 0xfffff8000c749860}, kq_head = {tqh_first = 0x0, tqh_last = 0xfffff801dd33a038}, kq_count = 0, kq_sel = {si_tdlist = {tqh_first = 0x0, tqh_last = 0x0}, si_note = {kl_list = {slh_first = 0x0}, kl_lock = 0xffffffff804fc560 <knlist_mtx_lock>, kl_unlock = 0xffffffff804fc5a0 <knlist_mtx_unlock>, kl_assert_locked = 0xffffffff804fc5e0 <knlist_mtx_assert_locked>, kl_assert_unlocked = 0xffffffff804fc5f0 <knlist_mtx_assert_unlocked>, kl_lockarg = 0xfffff801dd33a000}, si_mtx = 0x0}, kq_sigio = 0x0, kq_fdp = 0xfffff8000c749800, kq_state = 16, kq_knlistsize = 256, kq_knlist = 0xfffff8000c7a8800, kq_knhashmask = 0, kq_knhash = 0x0, kq_task = { ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0, ta_func = 0xffffffff804faeb0 <kqueue_task>, ta_context = 0xfffff801dd33a000}}
(kgdb) p *kn
No symbol "kn" in current context.
(kgdb) p i
No symbol "i" in current context.

