On Wed, 2015-03-11 at 17:11 +1100, Gavin Shan wrote: > When Linux guest recovers from EEH error on the following Emulex > adapter, the MSIx interrupts are disabled and the INTx emulation > is enabled. One INTx interrupt is injected to the guest by host > because of detected pending INTx interrupts on the adapter. QEMU > disables mmap'ed BAR regions and starts a timer to enable those > regions at later point the INTx interrupt handler. Unfortunately, > "VFIOPCIDevice->intx.pending" isn't cleared, meaning those disabled > mapp'ed BAR regions won't be reenabled properly. It leads to EEH > recovery failure at guest side because of hanged MMIO access. > > # lspci | grep Emulex > 0000:01:00.0 Ethernet controller: Emulex Corporation \ > OneConnect 10Gb NIC (be3) (rev 02) > 0000:01:00.1 Ethernet controller: Emulex Corporation \ > OneConnect 10Gb NIC (be3) (rev 02) > > The patch clears "VFIOPCIDevice->intx.pending" after EEH reset > is completed on the PE, which contains the adapter. In turn, the > mmap'ed BAR regions can be reenabled to avoid EEH recovery failure. > > Signed-off-by: Gavin Shan <gws...@linux.vnet.ibm.com> > --- > hw/vfio/pci.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c > index 8c4a8cb..55e0904 100644 > --- a/hw/vfio/pci.c > +++ b/hw/vfio/pci.c > @@ -3352,6 +3352,20 @@ int vfio_container_eeh_event(AddressSpace *as, int32_t > groupid, > } > > break; > + case VFIO_EEH_PE_RESET_DEACTIVATE: > + /* > + * We might have INTx interrupt whose handler disabled the > + * memory mapped BARs. Without clearing the INTx pending > + * state, the timer kicked by the INTx interrupt handler > + * won't enable those disabled memory mapped BARs, which > + * leads EEH recovery failure. > + */ > + QLIST_FOREACH(vbasedev, &group->device_list, next) { > + vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev); > + vdev->intx.pending = false; > + }
I'm nervous that "pending" is trying to track that a) the host interrupt is masked and b) the emulated INTx line for the device is asserted, but we're not clearing the state of any of that here. We can handle a spurious EOI, the device should simply re-assert the interrupt, but changing one piece of tracking w/o getting everything in sync seems like a looming bug. Thanks, Alex > + > + break; > } > > vfio_put_group(group);