Hi,

Recently my colleague ran into a kernel crash when he tried to assign four NICs to four VMs, one NIC per VM. Unfortunately he didn't collect the related logs, so for now all we have is the dmesg log from the core dump.

Here is the info:

linux:~ # lspci | grep -i eth
02:00.0 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
02:00.1 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
02:00.2 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
02:00.3 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
81:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
81:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
82:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
82:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

He used the last four NICs.

Dmesg:

[ 3449.519354] general protection fault: 0000 [#1] SMP
[ 3449.682056] CPU: 8 PID: 26794 Comm: qemu-kvm Tainted: G OE ---- ------- 3.10.0-514.44.5.10_44.x86_64 #1
[ 3449.692900] Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 3.87 02/02/2018
[ 3449.700115] task: ffff880e63172f10 ti: ffff880de5424000 task.ti: ffff880de5424000
[ 3449.707932] RIP: 0010:[<ffffffff8156a6fc>] [<ffffffff8156a6fc>] domain_remove_one_dev_info+0x9c/0x250
[ 3449.717586] RSP: 0018:ffff880de5427c88 EFLAGS: 00010093
[ 3449.723064] RAX: 0000000000000246 RBX: dead000000000100 RCX: ffff88203e49c258
[ 3449.730359] RDX: dead000000000100 RSI: 0000000000000001 RDI: ffff88203e46da40
[ 3449.737656] RBP: ffff880de5427cd8 R08: 0000000000000001 R09: 000000018040003c
[ 3449.744954] R10: 000000003e46da01 R11: ffffea0080f91b40 R12: ffff88203e46da40
[ 3449.752250] R13: ffff88203e49c240 R14: ffff88203ebe3098 R15: ffff88017fd17200
[ 3449.759548] FS: 00007fb9304e6c00(0000) GS:ffff88203f280000(0000) knlGS:0000000000000000
[ 3449.767966] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3449.773877] CR2: 00000000004bc020 CR3: 0000001de0324000 CR4: 00000000001627e0
[ 3449.781173] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3449.788469] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 3449.795768] Call Trace:
[ 3449.798387] [<ffffffff8156d6b9>] intel_iommu_attach_device+0x209/0x240
[ 3449.805159] [<ffffffff8155beb1>] __iommu_attach_device+0x21/0x80
[ 3449.811418] [<ffffffff8155d174>] __iommu_attach_group+0x54/0x80
[ 3449.817590] [<ffffffff8155d1cb>] iommu_attach_group+0x2b/0x40
[ 3449.823592] [<ffffffffa0222268>] vfio_iommu_type1_attach_group+0x1c8/0x652 [vfio_iommu_type1]
[ 3449.832542] [<ffffffffa01f4aea>] vfio_fops_unl_ioctl+0x1ba/0x300 [vfio]
[ 3449.839398] [<ffffffff81229cd8>] do_vfs_ioctl+0x2e8/0x4d0
[ 3449.845046] [<ffffffff81234c07>] ? __fd_install+0x47/0x60
[ 3449.850686] [<ffffffff81229f61>] SyS_ioctl+0xa1/0xc0
[ 3449.855908] [<ffffffff816c22ef>] system_call_fastpath+0x1c/0x21
[ 3449.862075] Code: 39 cb 48 8b 13 48 89 df 74 2d 49 89 dc 48 89 d3 4d 39 7c 24 30 75 e8 0f b6 75 ce 41 38 74 24 20 74 5d 48 39 cb 41 b8 01 00 00 00 <48> 8b 13 48 89 df 75 d7 0f 1f 40 00 48 89 c6 48 c7 c7 80 d0 fd
[ 3449.882792] RIP [<ffffffff8156a6fc>] domain_remove_one_dev_info+0x9c/0x250
[ 3449.889940] RSP <ffff880de5427c88>
[ 3449.894065] ---[ end trace e389931a63bcab52 ]---
[ 3450.471060] Kernel panic - not syncing: Fatal exception
[ 3451.512156] Shutting down cpus with NMI
[ 3452.068904] die even has been record!

Any suggestions would be appreciated!

Thanks,
Zongyong Wu

_______________________________________________
vfio-users mailing list
vfio-users@redhat.com
https://www.redhat.com/mailman/listinfo/vfio-users