https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229167
--- Comment #37 from Wei Hu <w...@microsoft.com> ---

I hit this bug on a FreeBSD 12.1 image in Azure with a Mellanox CX3 VF. It looks like the root file system is not yet available at the time the mlx4 driver tries to load the mlx4en kernel module. Adding a one-second sleep in mlx4_request_modules() makes the problem go away, but that does not look like a real fix to me (a sketch of that workaround follows the backtrace below). At the very least, namei() or vrefact() should check whether the vnode is NULL to avoid the panic.

Here is the detailed troubleshooting I did in the debugger when the crash happened.

The panic on console:
----------------------
pci1: <PCI bus> on pcib1
mlx4_core0: <mlx4_core> at device 2.0 on pci1
<6>mlx4_core: Mellanox ConnectX core driver v3.5.1 (April 2019)
mlx4_core: Initializing mlx4_core
mlx4_core0: Detected virtual function - running in slave mode
mlx4_core0: Sending reset
mlx4_core0: Sending vhcr0
mlx4_core0: HCA minimum page size:512
mlx4_core0: Timestamping is not supported in slave mode
mlx4_en mlx4_core0: Activating port:1
mlxen0: Ethernet address: 00:0d:3a:e8:16:18
<4>mlx4_en: mlx4_core0: Port 1: Using 4 TX rings
mlxen0: link state changed to DOWN
<4>mlx4_en: mlx4_core0: Port 1: Using 4 RX rings
<4>mlx4_en: mlxen0: Using 4 TX rings
hn0: link state changed to DOWN
<4>mlx4_en: mlxen0: Using 4 RX rings
<4>mlx4_en: mlxen0: Initializing port
mlx4_core0: About to load mlx4_en

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0x1d8    <-- 0x1d8 is the offset of (struct vnode *)->v_type
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80cb5c34
stack pointer           = 0x28:0xfffffe00004f4960
frame pointer           = 0x28:0xfffffe00004f4960
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (vmbusdev)
trap number             = 12
panic: page fault
cpuid = 2
time = 1599838711
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00004f4610
vpanic() at vpanic+0x19d/frame 0xfffffe00004f4660
panic() at panic+0x43/frame 0xfffffe00004f46c0
trap_fatal() at trap_fatal+0x39c/frame 0xfffffe00004f4720
trap_pfault() at trap_pfault+0x49/frame 0xfffffe00004f4780
trap() at trap+0x29f/frame 0xfffffe00004f4890
calltrap() at calltrap+0x8/frame 0xfffffe00004f4890
--- trap 0xc, rip = 0xffffffff80cb5c34, rsp = 0xfffffe00004f4960, rbp = 0xfffffe00004f4960 ---
vrefact() at vrefact+0x4/frame 0xfffffe00004f4960
namei() at namei+0x172/frame 0xfffffe00004f4a20
vn_open_cred() at vn_open_cred+0x221/frame 0xfffffe00004f4b70
linker_load_module() at linker_load_module+0x480/frame 0xfffffe00004f4e90
kern_kldload() at kern_kldload+0xc3/frame 0xfffffe00004f4ee0
mlx4_request_modules() at mlx4_request_modules+0xc2/frame 0xfffffe00004f4fa0
mlx4_load_one() at mlx4_load_one+0x349c/frame 0xfffffe00004f5660
mlx4_init_one() at mlx4_init_one+0x3f0/frame 0xfffffe00004f56b0
linux_pci_attach() at linux_pci_attach+0x432/frame 0xfffffe00004f5710
device_attach() at device_attach+0x3e1/frame 0xfffffe00004f5760
bus_generic_attach() at bus_generic_attach+0x5c/frame 0xfffffe00004f5790
pci_attach() at pci_attach+0xd5/frame 0xfffffe00004f57d0
device_attach() at device_attach+0x3e1/frame 0xfffffe00004f5820
bus_generic_attach() at bus_generic_attach+0x5c/frame 0xfffffe00004f5850
vmbus_pcib_attach() at vmbus_pcib_attach+0x75e/frame 0xfffffe00004f5930
device_attach() at device_attach+0x3e1/frame 0xfffffe00004f5980
device_probe_and_attach() at device_probe_and_attach+0x42/frame 0xfffffe00004f59b0
vmbus_add_child() at vmbus_add_child+0x7b/frame 0xfffffe00004f59e0
taskqueue_run_locked() at taskqueue_run_locked+0x154/frame 0xfffffe00004f5a40
taskqueue_thread_loop() at taskqueue_thread_loop+0x98/frame 0xfffffe00004f5a70
fork_exit() at fork_exit+0x83/frame 0xfffffe00004f5ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00004f5ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100080 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
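For reference, this is roughly where I put the delay. It is only a sketch of the workaround, not the actual driver source: the real logic of mlx4_request_modules() is elided, and pause(9) with hz ticks stands in for the "one-second sleep" mentioned above.

        /*
         * Workaround sketch only -- not the real mlx4_request_modules().
         * The idea is simply to delay the kldload of mlx4en until the
         * root file system (and therefore fd_rdir) exists.
         */
        static void
        mlx4_request_modules(struct mlx4_dev *dev)
        {
                /* ... existing checks that decide which module to load ... */

                /*
                 * Sleep roughly one second so the root vnode can be set up
                 * before linker_load_module() calls namei().
                 */
                pause("mlx4ld", hz);

                /* ... the existing request to load mlx4_en goes here ... */
        }

The rest of the DDB session below shows why the delay "works": without it, the lookup races the mounting of the root file system.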
db> x/i vrefact,10
vrefact:        pushq   %rbp
vrefact+0x1:    movq    %rsp,%rbp
vrefact+0x4:    cmpl    $0x4,ll+0x1b7(%rdi)     <-- if (__predict_false(vp->v_type == VCHR)) in vrefact()
vrefact+0xb:    jz      vrefact+0x1f
vrefact+0xd:    lock addl       $0x1,ll+0x19b(%rdi)
vrefact+0x15:   lock addl       $0x1,ll+0x19f(%rdi)
vrefact+0x1d:   popq    %rbp
vrefact+0x1e:   ret

db> x/i namei+0x160,10
namei+0x160:    call    _sx_slock_int
namei+0x165:    movq    0x10(%r13),%rdi
namei+0x169:    movq    %rdi,ll+0x7(%rbx)
namei+0x16d:    call    vrefact         <--- place in namei() that calls vrefact()
namei+0x172:    movq    0x18(%r13),%rax
namei+0x176:    movq    %rax,ll+0xf(%rbx)
namei+0x17a:    movq    ll+0x77(%rbx),%rax

This is the code in namei():

        /*
         * Get starting point for the translation.
         */
        FILEDESC_SLOCK(fdp);
        ndp->ni_rootdir = fdp->fd_rdir;
        vrefact(ndp->ni_rootdir);       <--- here
        ndp->ni_topdir = fdp->fd_jdir;

fdp is a (struct filedesc *) that was assigned earlier from the current proc's p_fd:

        p = td->td_proc;
        ...
        fdp = p->p_fd;

db> show thread
Thread 100080 at 0xfffff80004a95000:
 proc (pid 0): 0xffffffff81ff2060       <--- pointer to proc
 name: vmbusdev
 stack: 0xfffffe00004f2000-0xfffffe00004f5fff
 flags: 0x4  pflags: 0x200000
 state: RUNNING (CPU 2)
 priority: 8
 container lock: sched lock 2 (0xffffffff81eb3540)
 last voluntary switch: 50 ms ago

db> x/gx 0xffffffff81ff2060,20          (struct proc)
proc0:          0                       fffff80003609a60        ffffffff81ff25a0
proc0+0x18:     fffff80009866010        ffffffff81332c23
proc0+0x28:     30000                   0
proc0+0x38:     0                       fffff8000308ad00
proc0+0x48:     fffff800035168a0 (p_fd) 0
proc0+0x58:     fffff80003084e00        fffff8000308ac00

db> x/gx 0xfffff800035168a0,10          (struct filedesc)
0xfffff800035168a0:     fffff80003516920        0
0xfffff800035168b0:     0 (fd_rdir)             0
0xfffff800035168c0:     fffff80003516ce8        ffffffff
0xfffff800035168d0:     100000012               1
0xfffff800035168e0:     ffffffff812c4e35        2330000
0xfffff800035168f0:     0                       21
0xfffff80003516900:     0                       0
0xfffff80003516910:     0                       0

So at that moment fd_rdir (the root directory vnode) is still NULL, and namei() passes that NULL pointer to vrefact(), which faults when it reads v_type at offset 0x1d8.
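For illustration, here is one way the check suggested at the top could look. This is only a sketch, not a tested patch against namei(); the real error path would also need to undo whatever setup namei() has already done for the nameidata, which is elided here.

        /*
         * Sketch only: fail the lookup instead of dereferencing a NULL
         * root vnode when the lookup runs before the root file system
         * is mounted.
         */
        FILEDESC_SLOCK(fdp);
        ndp->ni_rootdir = fdp->fd_rdir;
        if (__predict_false(ndp->ni_rootdir == NULL)) {
                FILEDESC_SUNLOCK(fdp);
                /* real code would also clean up earlier namei() state */
                return (ENOENT);
        }
        vrefact(ndp->ni_rootdir);
        ndp->ni_topdir = fdp->fd_jdir;

With a check like this the kldload attempt from mlx4_request_modules() would fail cleanly instead of panicking, though the driver would still need to retry or defer the load until the root file system is available.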