> From: Dr. David Alan Gilbert [mailto:dgilb...@redhat.com] > Sent: Friday, August 23, 2019 5:27 PM > > * Tian, Kevin (kevin.t...@intel.com) wrote: > > > From: Dr. David Alan Gilbert [mailto:dgilb...@redhat.com] > > > Sent: Friday, August 23, 2019 3:13 AM > > > > > > * Kirti Wankhede (kwankh...@nvidia.com) wrote: > > > > > > > > > > > > On 8/22/2019 3:02 PM, Dr. David Alan Gilbert wrote: > > > > > * Kirti Wankhede (kwankh...@nvidia.com) wrote: > > > > >> Sorry for delay to respond. > > > > >> > > > > >> On 7/11/2019 5:37 PM, Dr. David Alan Gilbert wrote: > > > > >>> * Kirti Wankhede (kwankh...@nvidia.com) wrote: > > > > >>>> These functions save and restore PCI device specific data - config > > > > >>>> space of PCI device. > > > > >>>> Tested save and restore with MSI and MSIX type. > > > > >>>> > > > > >>>> Signed-off-by: Kirti Wankhede <kwankh...@nvidia.com> > > > > >>>> Reviewed-by: Neo Jia <c...@nvidia.com> > > > > >>>> --- > > > > >>>> hw/vfio/pci.c | 114 > > > ++++++++++++++++++++++++++++++++++++++++++ > > > > >>>> include/hw/vfio/vfio-common.h | 2 + > > > > >>>> 2 files changed, 116 insertions(+) > > > > >>>> > > > > >>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c > > > > >>>> index de0d286fc9dd..5fe4f8076cac 100644 > > > > >>>> --- a/hw/vfio/pci.c > > > > >>>> +++ b/hw/vfio/pci.c > > > > >>>> @@ -2395,11 +2395,125 @@ static Object > > > *vfio_pci_get_object(VFIODevice *vbasedev) > > > > >>>> return OBJECT(vdev); > > > > >>>> } > > > > >>>> > > > > >>>> +static void vfio_pci_save_config(VFIODevice *vbasedev, > QEMUFile *f) > > > > >>>> +{ > > > > >>>> + VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, > > > vbasedev); > > > > >>>> + PCIDevice *pdev = &vdev->pdev; > > > > >>>> + uint16_t pci_cmd; > > > > >>>> + int i; > > > > >>>> + > > > > >>>> + for (i = 0; i < PCI_ROM_SLOT; i++) { > > > > >>>> + uint32_t bar; > > > > >>>> + > > > > >>>> + bar = pci_default_read_config(pdev, PCI_BASE_ADDRESS_0 + > i * > > > 4, 4); > > > > >>>> + qemu_put_be32(f, bar); > > > > >>>> + } > > > > >>>> + > > > > >>>> + qemu_put_be32(f, vdev->interrupt); > > > > >>>> + if (vdev->interrupt == VFIO_INT_MSI) { > > > > >>>> + uint32_t msi_flags, msi_addr_lo, msi_addr_hi = 0, > > > > >>>> msi_data; > > > > >>>> + bool msi_64bit; > > > > >>>> + > > > > >>>> + msi_flags = pci_default_read_config(pdev, pdev->msi_cap + > > > PCI_MSI_FLAGS, > > > > >>>> + 2); > > > > >>>> + msi_64bit = (msi_flags & PCI_MSI_FLAGS_64BIT); > > > > >>>> + > > > > >>>> + msi_addr_lo = pci_default_read_config(pdev, > > > > >>>> + pdev->msi_cap + > > > > >>>> PCI_MSI_ADDRESS_LO, 4); > > > > >>>> + qemu_put_be32(f, msi_addr_lo); > > > > >>>> + > > > > >>>> + if (msi_64bit) { > > > > >>>> + msi_addr_hi = pci_default_read_config(pdev, > > > > >>>> + pdev->msi_cap + > > > > >>>> PCI_MSI_ADDRESS_HI, > > > > >>>> + 4); > > > > >>>> + } > > > > >>>> + qemu_put_be32(f, msi_addr_hi); > > > > >>>> + > > > > >>>> + msi_data = pci_default_read_config(pdev, > > > > >>>> + pdev->msi_cap + (msi_64bit ? PCI_MSI_DATA_64 : > > > PCI_MSI_DATA_32), > > > > >>>> + 2); > > > > >>>> + qemu_put_be32(f, msi_data); > > > > >>>> + } else if (vdev->interrupt == VFIO_INT_MSIX) { > > > > >>>> + uint16_t offset; > > > > >>>> + > > > > >>>> + /* save enable bit and maskall bit */ > > > > >>>> + offset = pci_default_read_config(pdev, > > > > >>>> + pdev->msix_cap + > > > > >>>> PCI_MSIX_FLAGS + 1, 2); > > > > >>>> + qemu_put_be16(f, offset); > > > > >>>> + msix_save(pdev, f); > > > > >>>> + } > > > > >>>> + pci_cmd = pci_default_read_config(pdev, PCI_COMMAND, 2); > > > > >>>> + qemu_put_be16(f, pci_cmd); > > > > >>>> +} > > > > >>>> + > > > > >>>> +static void vfio_pci_load_config(VFIODevice *vbasedev, > QEMUFile *f) > > > > >>>> +{ > > > > >>>> + VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, > > > vbasedev); > > > > >>>> + PCIDevice *pdev = &vdev->pdev; > > > > >>>> + uint32_t interrupt_type; > > > > >>>> + uint32_t msi_flags, msi_addr_lo, msi_addr_hi = 0, msi_data; > > > > >>>> + uint16_t pci_cmd; > > > > >>>> + bool msi_64bit; > > > > >>>> + int i; > > > > >>>> + > > > > >>>> + /* retore pci bar configuration */ > > > > >>>> + pci_cmd = pci_default_read_config(pdev, PCI_COMMAND, 2); > > > > >>>> + vfio_pci_write_config(pdev, PCI_COMMAND, > > > > >>>> + pci_cmd & (!(PCI_COMMAND_IO | > > > PCI_COMMAND_MEMORY)), 2); > > > > >>>> + for (i = 0; i < PCI_ROM_SLOT; i++) { > > > > >>>> + uint32_t bar = qemu_get_be32(f); > > > > >>>> + > > > > >>>> + vfio_pci_write_config(pdev, PCI_BASE_ADDRESS_0 + i * 4, > > > > >>>> bar, > 4); > > > > >>>> + } > > > > >>> > > > > >>> Is it possible to validate the bar's at all? We just had a bug on a > > > > >>> virtual device where one version was asking for a larger bar than > > > > >>> the > > > > >>> other; our validation caught this in some cases so we could tell > > > > >>> that > > > > >>> the guest had a BAR that was aligned at the wrong alignment. > > > > I'm a bit confused here. Did you mean that src and dest include > > different versions of the virtual device which implements different > > BAR size? If that is the case, shouldn't the migration fail at the start > > when doing compatibility check? > > It was a mistake where the destination had accidentally changed the BAR > size; checking the alignment was the only check that failed. > > > > > >>> > > > > >> > > > > >> "Validate the bars" does that means validate size of bars? > > > > > > > > > > I meant validate the address programmed into the BAR against the size, > > > > > assuming you know the size; e.g. if it's a 128MB BAR, then make sure > the > > > > > address programmed in is 128MB aligned. > > > > > > > > > > > > > If this validation fails, migration resume should fail, right? > > > > > > Yes I think so; if you've got a device that wants 128MB alignment and > > > someone gives you a non-aligned address, who knows what will happen. > > > > If misalignment is really caused by the guest, shouldn't we just follow > > the hardware behavior, i.e. hard-wiring the lower bits to 0 before > > updating the cfg space? > > That should already happen on the source; but when loading a migration > stream I try and be very untrusting; so it's good to check that the > destination devices idea of the BAR matches what the register has. >
OK, it makes sense. Thanks Kevin