An EEH PE reset clears the MSIx table of each PCI device in hardware. QEMU, however, still holds the stale MSIx entries, which should be cleared as well. Otherwise, we run into another (recursive) EEH error and the PCI devices contained in the PE have to be taken offline.

This patch clears the stale MSIx table before the EEH PE reset so that the MSIx table can be restored properly after the reset.

Signed-off-by: Gavin Shan <gws...@linux.vnet.ibm.com>
---
 hw/misc/vfio.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 1a3e7eb..3cf7f02 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -4424,6 +4424,8 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
 {
     VFIOGroup *group;
     VFIOContainer *container;
+    VFIODevice *vdev;
+    struct vfio_eeh_pe_op *arg;
     int ret = -1;
 
     group = vfio_get_group(groupid, as);
@@ -4442,7 +4444,29 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
     switch (req) {
     case VFIO_CHECK_EXTENSION:
     case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
+        break;
     case VFIO_EEH_PE_OP:
+        arg = (struct vfio_eeh_pe_op *)param;
+        switch (arg->op) {
+        case VFIO_EEH_PE_RESET_HOT:
+        case VFIO_EEH_PE_RESET_FUNDAMENTAL:
+            /*
+             * The MSIx table will be cleaned out by reset. We need to
+             * disable it so that it can be reenabled properly. Also,
+             * the cached MSIx table should be cleared as it no longer
+             * reflects the contents in hardware.
+             */
+            QLIST_FOREACH(vdev, &group->device_list, next) {
+                if (msix_enabled(&vdev->pdev)) {
+                    vfio_disable_msix(vdev);
+                }
+
+                msix_reset(&vdev->pdev);
+            }
+
+            break;
+        }
+        break;
     default:
         /* Return an error on unknown requests */
-- 
1.8.3.2