On Thu, May 11, 2023 at 10:27:35AM +0200, Fiona Ebner wrote:
> On 03.05.23 at 02:27, Leonardo Bras wrote:
> > Since its implementation on v8.0.0-rc0, having the PCI_ERR_UNCOR_MASK
> > set for machine types < 8.0 will cause migration to fail if the target
> > QEMU version is < 8.0.0 :
> >
> > qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10a read:
> > 40 device: 0 cmask: ff wmask: 0 w1cmask:0
> > qemu-system-x86_64: Failed to load PCIDevice:config
> > qemu-system-x86_64: Failed to load e1000e:parent_obj
> > qemu-system-x86_64: error while loading state for instance 0x0 of device
> > '0000:00:02.0/e1000e'
> > qemu-system-x86_64: load of migration failed: Invalid argument
> >
> > The above test migrated a 7.2 machine type from QEMU master to QEMU 7.2.0,
> > with this cmdline:
> >
> > ./qemu-system-x86_64 -M pc-q35-7.2 [-incoming XXX]
> >
> > In order to fix this, property x-pcie-err-unc-mask was introduced to
> > control when PCI_ERR_UNCOR_MASK is enabled. This property is enabled by
> > default, but is disabled if machine type <= 7.2.
> >
> > Fixes: 010746ae1d ("hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register")
> > Suggested-by: Michael S. Tsirkin <m...@redhat.com>
> > Signed-off-by: Leonardo Bras <leob...@redhat.com>
>
> Thank you for the patch!
>
> Closes: https://gitlab.com/qemu-project/qemu/-/issues/1576
>
> AFAICT, this breaks (forward) migration from 8.0 to 8.0 + this patch
> when using machine type <= 7.2. That is because after this patch, when
> using machine type <= 7.2, the wmask for the register is not set and
> when 8.0 sends a nonzero value for the register, the error condition in
> get_pci_config_device() will trigger again.
>
> Is it necessary to also handle that? Maybe by special casing the error
> condition in get_pci_config_device() to be prepared to accept such a
> stream from 8.0?
>
> If that is considered not worth it, consider this:
>
> Tested-by: Fiona Ebner <f.eb...@proxmox.com>
>
> Best Regards,
> Fiona
Yes, any fix is like that. We keep encountering bugs like this, but there
does not seem to be the will to create infrastructure for fixing them, which
would involve describing the version of QEMU being migrated to.

--
MST