PCI devices needing a ROM allocate an optional MemoryRegion with pci_add_option_rom(). pci_del_option_rom() does the cleanup when the device is destroyed. The only action taken by this routine is to call vmstate_unregister_ram() which clears the id string of the optional ROM RAMBlock and now, also flags the RAMBlock as non-migratable. This was recently added by commit b895de502717 ("migration: discard non-migratable RAMBlocks"), .
VFIO devices do their own loading of the PCI option ROM in vfio_pci_size_rom(). The memory region is switched to an I/O region and the PCI attribute 'has_rom' is set but the RAMBlock of the ROM region is not allocated. When the associated PCI device is deleted, pci_del_option_rom() calls vmstate_unregister_ram() which tries to flag a NULL RAMBlock, leading to a SEGV. It seems that 'has_rom' was set to have memory_region_destroy() called, but since commit 469b046ead06 ("memory: remove memory_region_destroy") this is not necessary anymore as the MemoryRegion is freed automagically. Remove the PCIDevice 'has_rom' attribute setting in vfio. Signed-off-by: Cédric Le Goater <c...@kaod.org> --- Tested on a KVM POWER9 pseries machine and a Mellanox MT27710 Ethernet controller. Performed a couple of plug/unplug, migrated, and did a couple more unplug/plug before powering off. The same tests were done with the previous patches which were addressing the issue at a different level : 1. [PATCH] exec.c: check RAMBlock validity before changing its flag https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg00009.html 2. [PATCH] pci: remove pci_del_option_rom() https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg01651.html Do we still want to remove pci_del_option_rom() ? I caught this bug while deleting a passthrough device from a pseries machine. Here is the stack: #0 qemu_ram_unset_migratable (rb=0x0) at /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994 #1 0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, dev=<optimized out>) #2 0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330) #3 pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>) #4 0x00000001005ff910 in device_set_realized (obj=0x101796330, value=<optimized out>, errp=0x0) #5 0x00000001007a487c in property_set_bool (obj=0x101796330, v=<optimized out>, name=<optimized out>, #6 0x00000001007a7878 in object_property_set (obj=0x101796330, v=0x7fff70033110, #7 0x00000001007aaf1c in object_property_set_qobject (obj=0x101796330, value=<optimized out>, #8 0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, value=<optimized out>, #9 0x00000001005fcdd8 in device_unparent (obj=0x101796330) #10 0x00000001007a6dd0 in object_finalize_child_property (obj=<optimized out>, name=<optimized out>, #11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, child=0x101796330, #12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb (dev=0x101796330) #13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0) #14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0) #15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0) #16 0x000000010042a50c in rtas_set_isolation_state (state=0, idx=<optimized out>) hw/vfio/pci.c | 1 - 1 file changed, 1 deletion(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index a1577dea7fdb..6cbb8fa0549d 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -990,7 +990,6 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev) pci_register_bar(&vdev->pdev, PCI_ROM_SLOT, PCI_BASE_ADDRESS_SPACE_MEMORY, &vdev->pdev.rom); - vdev->pdev.has_rom = true; vdev->rom_read_failed = false; } -- 2.17.1