On 25/09/2021 19:25, Andriy Gapon wrote:
On 13/06/2021 11:19, Kristof Provost wrote:
On 13 Jun 2021, at 09:41, Andriy Gapon <a...@freebsd.org> wrote:
Based on the panic message (page fault with non-sleepable locks held), it seems that
the problem is with holding the lock across the copyout.  Usually that
won't panic, but if the destination happens to be paged out... And only
with INVARIANTS, I guess...

Oh right. Thanks. I’ve gotten bitten by that one before, but had clearly
garbage collected the memory.

I’ll fix this one and check for others on Monday.

I’ll also see if we can persuade copyout to always panic on this bug, not
just when the destination memory is actually paged out. That way we’ll catch
this in the regression tests in the future.

I upgraded to the latest stable/13 and hit a fresh panic of the same type.
This time it's in pf_getstatus(), and it's a copyout while the 'pf rulesets' lock is held.


<118>Enabling pf
Kernel page fault with the following non-sleepable locks held:
shared rm pf rulesets (pf rulesets) r = 0 (0xffffffff85764020) locked @ /usr/devel/git/trant/sys/netpfil/pf/pf_ioctl.c:4945
stack backtrace:
#0 0xffffffff808cb43d at witness_debugger+0x6d
#1 0xffffffff808cc2ab at witness_warn+0x21b
#2 0xffffffff80b567f1 at trap_pfault+0x71
#3 0xffffffff80b55df8 at trap+0x288
#4 0xffffffff80b56b59 at trap_check+0x29
#5 0xffffffff80b32298 at calltrap+0x8
#6 0xffffffff8574cae8 at pf_getstatus+0x548
#7 0xffffffff85747430 at pfioctl+0x2590
#8 0xffffffff8073854f at devfs_ioctl+0xcf
#9 0xffffffff80bd8c26 at VOP_IOCTL_APV+0x96
#10 0xffffffff8094c424 at VOP_IOCTL+0x34
#11 0xffffffff80947600 at vn_ioctl+0xc0
#12 0xffffffff80738a3e at devfs_ioctl_f+0x1e
#13 0xffffffff808cf8fb at fo_ioctl+0xb
#14 0xffffffff808cf897 at kern_ioctl+0x1d7
#15 0xffffffff808cf60d at sys_ioctl+0x12d
#16 0xffffffff80b57353 at syscallenter+0x163
#17 0xffffffff80b57025 at amd64_syscall+0x15

Hmm, there is more:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0x800a0e000
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff80b52e8a
stack pointer           = 0x28:0xfffffe01b45d0150
frame pointer           = 0x28:0xfffffe01b45d0150
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1436 (pfctl)
trap number             = 12
panic: page fault
cpuid = 2
time = 1632573676
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff805c9eeb = db_trace_self_wrapper+0x2b/frame 0xfffffe01b45cfd10
kdb_backtrace() at 0xffffffff808aafe7 = kdb_backtrace+0x37/frame 0xfffffe01b45cfdc0
vpanic() at 0xffffffff8086788c = vpanic+0x18c/frame 0xfffffe01b45cfe20
panic() at 0xffffffff808674a3 = panic+0x43/frame 0xfffffe01b45cfe80
trap_fatal() at 0xffffffff80b56725 = trap_fatal+0x375/frame 0xfffffe01b45cfee0
trap_pfault() at 0xffffffff80b56800 = trap_pfault+0x80/frame 0xfffffe01b45cff50
trap() at 0xffffffff80b55df8 = trap+0x288/frame 0xfffffe01b45d0060
trap_check() at 0xffffffff80b56b59 = trap_check+0x29/frame 0xfffffe01b45d0080
calltrap() at 0xffffffff80b32298 = calltrap+0x8/frame 0xfffffe01b45d0080
--- trap 0xc, rip = 0xffffffff80b52e8a, rsp = 0xfffffe01b45d0150, rbp = 0xfffffe01b45d0150 ---
copyout_nosmap_std() at 0xffffffff80b52e8a = copyout_nosmap_std+0x15a/frame 0xfffffe01b45d0150
pf_getstatus() at 0xffffffff8574cae8 = pf_getstatus+0x548/frame 0xfffffe01b45d0480
pfioctl() at 0xffffffff85747430 = pfioctl+0x2590/frame 0xfffffe01b45d0930
devfs_ioctl() at 0xffffffff8073854f = devfs_ioctl+0xcf/frame 0xfffffe01b45d0990
VOP_IOCTL_APV() at 0xffffffff80bd8c26 = VOP_IOCTL_APV+0x96/frame 0xfffffe01b45d09b0
VOP_IOCTL() at 0xffffffff8094c424 = VOP_IOCTL+0x34/frame 0xfffffe01b45d0a00
vn_ioctl() at 0xffffffff80947600 = vn_ioctl+0xc0/frame 0xfffffe01b45d0af0
devfs_ioctl_f() at 0xffffffff80738a3e = devfs_ioctl_f+0x1e/frame 0xfffffe01b45d0b10
fo_ioctl() at 0xffffffff808cf8fb = fo_ioctl+0xb/frame 0xfffffe01b45d0b20
kern_ioctl() at 0xffffffff808cf897 = kern_ioctl+0x1d7/frame 0xfffffe01b45d0b80
sys_ioctl() at 0xffffffff808cf60d = sys_ioctl+0x12d/frame 0xfffffe01b45d0c50
syscallenter() at 0xffffffff80b57353 = syscallenter+0x163/frame 0xfffffe01b45d0ca0
amd64_syscall() at 0xffffffff80b57025 = amd64_syscall+0x15/frame 0xfffffe01b45d0d30
fast_syscall_common() at 0xffffffff80b32bab = fast_syscall_common+0xf8/frame 0xfffffe01b45d0d30
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80042adaa, rsp = 0x7fffffffe5f8, rbp = 0x7fffffffe650 ---
Uptime: 1m10s
Dumping 1462 out of 32644 MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..91%


Unfortunately kgdb itself crashes when trying to examine the dump.
I think it's strange that copyout_nosmap_std() crashes with a page fault apparently when writing to a userland address.

--
Andriy Gapon
