On 5/24/2021 2:29 PM, Steven Sistare wrote: > On 5/21/2021 6:24 PM, Alex Williamson wrote:> On Fri, 7 May 2021 05:25:10 > -0700 >> Steve Sistare <steven.sist...@oracle.com> wrote: >> >>>[...] >>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >>> index 7a4fb6c..f7ac9f03 100644 >>> --- a/hw/vfio/pci.c >>> +++ b/hw/vfio/pci.c >>> @@ -29,6 +29,8 @@ >>> #include "hw/qdev-properties.h" >>> #include "hw/qdev-properties-system.h" >>> #include "migration/vmstate.h" >>> +#include "migration/cpr.h" >>> +#include "qemu/env.h" >>> #include "qemu/error-report.h" >>> #include "qemu/main-loop.h" >>> #include "qemu/module.h" >>> @@ -1612,6 +1614,14 @@ static void vfio_mmap_set_enabled(VFIOPCIDevice >>> *vdev, bool enabled) >>> } >>> } >>> >>> +static void vfio_config_sync(VFIOPCIDevice *vdev, uint32_t offset, size_t >>> len) >>> +{ >>> + if (pread(vdev->vbasedev.fd, vdev->pdev.config + offset, len, >>> + vdev->config_offset + offset) != len) { >>> + error_report("vfio_config_sync pread failed"); >>> + } >>> +} >>> + >>> static void vfio_bar_prepare(VFIOPCIDevice *vdev, int nr) >>> { >>> VFIOBAR *bar = &vdev->bars[nr]; >>> @@ -1652,6 +1662,7 @@ static void vfio_bars_prepare(VFIOPCIDevice *vdev) >>> static void vfio_bar_register(VFIOPCIDevice *vdev, int nr) >>> { >>> VFIOBAR *bar = &vdev->bars[nr]; >>> + PCIDevice *pdev = &vdev->pdev; >>> char *name; >>> >>> if (!bar->size) { >>> @@ -1672,7 +1683,10 @@ static void vfio_bar_register(VFIOPCIDevice *vdev, >>> int nr) >>> } >>> } >>> >>> - pci_register_bar(&vdev->pdev, nr, bar->type, bar->mr); >>> + pci_register_bar(pdev, nr, bar->type, bar->mr); >>> + if (pdev->reused) { >>> + vfio_config_sync(vdev, pci_bar(pdev, nr), 8); >> >> Assuming 64-bit BARs? This might be the first case where we actually >> rely on the kernel BAR values, IIRC we usually use QEMU's emulation. > > No asssumptions. vfio_config_sync() preads a piece of config space using a > single > system call, copying directly to the qemu buffer, not looking at words or > calling any > action functions. > >[...] >>> @@ -3098,6 +3115,11 @@ static void vfio_realize(PCIDevice *pdev, Error >>> **errp) >>> vfio_register_req_notifier(vdev); >>> vfio_setup_resetfn_quirk(vdev); >>> >>> + vfio_config_sync(vdev, pdev->msix_cap + PCI_MSIX_FLAGS, 2); >>> + if (pdev->reused) { >>> + pci_update_mappings(pdev); >>> + } >>> + >> >> Are the msix flag sync and mapping update related? They seem >> independent to me. A blank line and comment would be helpful. > > OK. > >> I expect we'd need to call msix_enabled() somewhere for the msix flag >> sync to be effective. > > Yes, vfio_pci_post_load in cpr part 2 calls msix_enabled. > >> Is there an assumption here of msi-x only support or is it not needed >> for msi or intx? > > The code supports msi-x and msi. However, I should only be sync'ing > PCI_MSIX_FLAGS > if pdev->cap_present & QEMU_PCI_CAP_MSIX. And, I am missing a sync for > PCI_MSI_FLAGS. > I'll fix that.
Hi Alex, FYI, I am making more changes here. The calls to vfio_config_sync fix pdev->config[] words that are initialized during vfio_realize(), by pread'ing from the live kernel config. However, it makes more sense to suppress the undesired re-initialization, rather than undo the damage later. Thus I will add a few more 'if (!pdev->reused)' guards in msix and pci bar init functions, and delete vfio_config_sync. Most of the config is preserved in the kernel across restart. However, the bits that are purely emulated (indicated by the emulated_config_bits mask) may be rejected when they are written through to the kernel, and thus are currently lost on restart. I need to save pdev->config[] in the vmstate file, and in vfio_pci_post_load, merge it with the kernel config using emulated_config_bits. Sound sane? - Steve