On 6/1/2025 3:07 PM, Michael S. Tsirkin wrote:
On Sun, Jun 01, 2025 at 06:38:43PM +0200, Cédric Le Goater wrote:
On 5/29/25 21:24, Steve Sistare wrote:
Do not reset a vfio-pci device during CPR.
Signed-off-by: Steve Sistare <steven.sist...@oracle.com>
---
include/hw/pci/pci_device.h | 3 +++
hw/pci/pci.c | 5 +++++
hw/vfio/pci.c | 7 +++++++
3 files changed, 15 insertions(+)
diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
index e41d95b..b481c5d 100644
--- a/include/hw/pci/pci_device.h
+++ b/include/hw/pci/pci_device.h
@@ -181,6 +181,9 @@ struct PCIDevice {
uint32_t max_bounce_buffer_size;
char *sriov_pf;
+
+ /* CPR */
+ bool skip_reset_on_cpr;
};
static inline int pci_intx(PCIDevice *pci_dev)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index f5ab510..21eb11c 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -32,6 +32,7 @@
#include "hw/pci/pci_host.h"
#include "hw/qdev-properties.h"
#include "hw/qdev-properties-system.h"
+#include "migration/cpr.h"
#include "migration/qemu-file-types.h"
#include "migration/vmstate.h"
#include "net/net.h"
@@ -531,6 +532,10 @@ static void pci_reset_regions(PCIDevice *dev)
static void pci_do_device_reset(PCIDevice *dev)
{
+ if (dev->skip_reset_on_cpr && cpr_is_incoming()) {
+ return;
+ }
Since ->skip_reset_on_cpr is only true for vfio-pci devices, it could be
replaced by: object_dynamic_cast(OBJECT(dev), "vfio-pci")
Thanks,
C.
True, but I don't really like driver-dependent hacks.
What exactly about vfio makes it survive without this reset?
The kernel descriptors remain open and all the active kernel PCI state
remains in place. The device was never quiesced or de-configured in old QEMU.
The cast is fine with me; it depends on what Michael wants.
- Steve
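For readers comparing the two options discussed above, here is a minimal
standalone sketch of the trade-off. It is not QEMU code: MockPCIDevice,
skip_by_flag, skip_by_type and mock_device_reset are hypothetical names, the
cpr_incoming parameter stands in for cpr_is_incoming(), and strcmp() is only a
rough stand-in for object_dynamic_cast(), which in QEMU would also match
subclasses of the named type.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for PCIDevice; not the real QEMU structure. */
typedef struct MockPCIDevice {
    const char *type_name;   /* stands in for the device's QOM type */
    bool skip_reset_on_cpr;  /* the flag added by the patch */
    int reset_count;         /* number of resets actually performed */
} MockPCIDevice;

/* Option A (the patch): generic PCI code tests a per-device flag. */
static bool skip_by_flag(const MockPCIDevice *dev, bool cpr_incoming)
{
    assert(dev != NULL);
    return dev->skip_reset_on_cpr && cpr_incoming;
}

/*
 * Option B (the suggestion): generic PCI code tests the device type.
 * strcmp() is an assumption made for this sketch; it only matches the
 * exact type name.
 */
static bool skip_by_type(const MockPCIDevice *dev, bool cpr_incoming)
{
    assert(dev != NULL);
    return cpr_incoming && strcmp(dev->type_name, "vfio-pci") == 0;
}

/* Simplified model of the reset path, using option A. */
static void mock_device_reset(MockPCIDevice *dev, bool cpr_incoming)
{
    if (skip_by_flag(dev, cpr_incoming)) {
        return; /* device keeps its live configuration */
    }
    dev->reset_count++;
}
```

With option A the generic reset path stays free of driver names and any other
device could opt in by setting the flag. Option B avoids a new struct field
but places knowledge of one specific driver in the common PCI code.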
pci_device_deassert_intx(dev);
assert(dev->irq_state == 0);
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 7d3b9ff..56e7fdd 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3402,6 +3402,13 @@ static void vfio_instance_init(Object *obj)
/* QEMU_PCI_CAP_EXPRESS initialization does not depend on QEMU command
* line, therefore, no need to wait to realize like other devices */
pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS;
+
+ /*
+ * A device that is resuming for cpr is already configured, so do not
+ * reset it during qemu_system_reset prior to cpr load, else interrupts
+ * may be lost.
+ */
+ pci_dev->skip_reset_on_cpr = true;
}
static void vfio_pci_base_dev_class_init(ObjectClass *klass,
const void *data)