Dear Laurent,

Thank you for your quick reply. We used qemu-7.1, but the issue is reproducible with QEMU from v6.2 up to the recent v8.0 release candidates. I found that it was introduced by commit 9323f892b39 (between v6.2.0-rc2 and v6.2.0-rc3).
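To make the suspected mechanism concrete, here is a toy model in Python. It is not QEMU code: the class and function names are hypothetical, and it assumes that device_del rejects a request while the device's pending_deleted_event flag is still set, which is what the error message suggests.

```python
# Toy model only; ToyDevice, device_del and guest_completes_unplug are
# hypothetical names, not QEMU functions.
class ToyDevice:
    def __init__(self, dev_id):
        self.dev_id = dev_id
        self.plugged = True
        # Mirrors the flag name from the thread; semantics are assumed.
        self.pending_deleted_event = False


def device_del(dev):
    """Request an unplug; refuse if an earlier request is still pending."""
    if dev.pending_deleted_event:
        return {"error": {
            "class": "GenericError",
            "desc": f"Device {dev.dev_id} is already in the process of unplug",
        }}
    # Assumed behaviour: the unplug request handler marks the device.
    dev.pending_deleted_event = True
    return {"return": {}}


def guest_completes_unplug(dev):
    """The guest acknowledges the unplug; the flag is reset."""
    dev.plugged = False
    dev.pending_deleted_event = False


disk = ToyDevice("virtio-diskX")
first = device_del(disk)    # accepted, flag is now set
second = device_del(disk)   # rejected while the flag is still set
guest_completes_unplug(disk)
```

In this model, any request that arrives before the guest has completed the unplug is refused, so a guest that never reacts to the first request would leave the device stuck in the "pending" state.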
If it doesn't break anything else, it would suffice to remove the line below from acpi_pcihp_device_unplug_request_cb():

    pdev->qdev.pending_deleted_event = true;

but you may have a reason to keep it. First of all, I'll open a bug in the bug tracker and let you know.

Best regards,
Yu Zhang

On Mon, Apr 3, 2023 at 6:32 PM Laurent Vivier <lviv...@redhat.com> wrote:

> Hi Yu,
>
> please open a bug in the bug tracker:
>
> https://gitlab.com/qemu/qemu/-/issues
>
> It's easier to track the problem.
>
> What is the version of QEMU you are using?
> Could you provide QEMU command line?
>
> Thanks,
> Laurent
>
> On 4/3/23 15:24, Yu Zhang wrote:
> > Dear Laurent,
> >
> > recently we ran into an issue with the following error:
> >
> > command '{ "execute": "device_del", "arguments": { "id": "virtio-diskX" } }'
> > for VM "id" failed ({ "return": {"class": "GenericError", "desc": "Device
> > virtio-diskX is already in the process of unplug"} }).
> >
> > The issue is reproducible. With a delay of a few seconds before the
> > hot-unplug, the hot-unplug works fine.
> >
> > After some digging, we found that the commit 9323f892b39 may cause the
> > issue.
> > ------------------
> >     failover: fix unplug pending detection
> >
> >     Failover needs to detect the end of the PCI unplug to start migration
> >     after the VFIO card has been unplugged.
> >
> >     To do that, a flag is set in pcie_cap_slot_unplug_request_cb() and
> >     reset in pcie_unplug_device().
> >
> >     But since
> >         17858a169508 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35")
> >     we have switched to ACPI unplug and these functions are not called
> >     anymore and the flag not set. So failover migration is not able to
> >     detect if card is really unplugged and acts as it's done as soon as
> >     it's started. So it doesn't wait the end of the unplug to start the
> >     migration. We don't see any problem when we test that because ACPI
> >     unplug is faster than PCIe native hotplug and when the migration
> >     really starts the unplug operation is already done.
> >
> >     See c000a9bd06ea ("pci: mark device having guest unplug request pending")
> >         a99c4da9fc2a ("pci: mark devices partially unplugged")
> >
> >     Signed-off-by: Laurent Vivier <lviv...@redhat.com>
> >     Reviewed-by: Ani Sinha <a...@anisinha.ca>
> >     Message-Id: <20211118133225.324937-4-lviv...@redhat.com>
> >     Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
> >     Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> > ------------------
> > The purpose is to detect the end of the PCI device hot-unplug. However,
> > we find the error confusing. How is it possible that a disk "is already
> > in the process of unplug" during the first hot-unplug attempt? As far as
> > I know, the issue was also encountered by libvirt, but they simply
> > ignored it:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=1878659
> >
> > Hence, a question is: should we have the line below in
> > acpi_pcihp_device_unplug_request_cb()?
> >
> >     pdev->qdev.pending_deleted_event = true;
> >
> > It would be great if you, as the author, could give us a few hints.
> >
> > Thank you very much for your reply!
> >
> > Sincerely,
> >
> > Yu Zhang @ Compute Platform IONOS
> > 03.04.2013