> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Saturday, January 21, 2017 2:21 AM
>
> On Fri, 20 Jan 2017 06:57:22 +0000
> "Tian, Kevin" <kevin.t...@intel.com> wrote:
>
> > > From: Alex Williamson
> > > Sent: Thursday, January 19, 2017 6:32 AM
> > >
> > > On Sat, 31 Dec 2016 17:13:07 +0800
> > > Cao jin <caoj.f...@cn.fujitsu.com> wrote:
> > >
> > > > From: Chen Fan <chen.fan.f...@cn.fujitsu.com>
> > > >
> > > > When physical device has uncorrectable error happened, the vfio_pci
> > > > driver will signal the uncorrectable error status register value to
> > > > corresponding QEMU's vfio-pci device via the eventfd registered by this
> > > > device, then, the vfio-pci's error eventfd handler will be invoked in
> > > > event loop.
> > > >
> > > > Construct and pass the aer message to root port, root port will trigger
> > > > an interrupt to signal guest, then, the guest driver will do the recovery.
> > > >
> > > > Note: Now only support non-fatal error's recovery, fatal error will
> > > > still result in vm stop.
> > >
> > > Please update the entire commit log, don't just add a note that this
> > > now only covers non-fatal errors.
> > >
> >
> > One thing related to vIOMMU. There is still a TODO task about forwarding
> > IOMMU fault thru VFIO to Qemu, so Qemu vIOMMU has the chance to
> > walk guest remapping structure to emulate virtual IOMMU fault. Likely
> > it also requires eventfd mechanism.
> >
> > Wondering whether making sense to reuse same eventfd for both AER
> > and vIOMMU or using separate eventfd is also fine? Even go with the
> > former option, I don't expect substantial change to this series. Major
> > change is on interface definition - extensible to multiple types of
> > fault/error conditions instead of assuming AER only.
> >
> > Thoughts?
>
> We can't really convey any information through an eventfd, it's just a
> signal, so I don't think we can use the same eventfd for both types of
> errors. Already here we're discussing the idea of using separate
> eventfds for fatal vs non-fatal AERs. IOMMU error processing seems
> like yet another eventfd and likely some region or ioctl mechanism for
> retrieving the error details since the IOMMU hardware is not directly
> accessible. Furthermore, such an event might logically be connected to
> the vfio container rather than the device, so it might not even use the
> same file descriptor. Thanks,
>

Clear enough.

Thanks,
Kevin
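
For reference, a minimal sketch of the point made above: an eventfd only carries a 64-bit counter, so the reader learns "how many times was this signalled" and nothing about what kind of event it was. Using one descriptor per event class is one way to keep the classes apart. This is plain Linux eventfd(2) usage, not VFIO or QEMU code; the names err_kind, err_notifiers, err_notifiers_init, err_signal and err_drain are hypothetical and exist only for illustration.

```c
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Hypothetical event classes, for illustration only. */
enum err_kind { ERR_NONFATAL = 0, ERR_FATAL = 1, ERR_KIND_MAX };

/* One eventfd per class: which fd fired is the only "type" information. */
struct err_notifiers {
    int fd[ERR_KIND_MAX];
};

/* Create one non-blocking eventfd per class; returns 0 on success, -1 on error. */
int err_notifiers_init(struct err_notifiers *n)
{
    for (int i = 0; i < ERR_KIND_MAX; i++) {
        n->fd[i] = eventfd(0, EFD_NONBLOCK);
        if (n->fd[i] < 0) {
            while (i-- > 0)
                close(n->fd[i]);
            return -1;
        }
    }
    return 0;
}

/* Producer side: add 1 to the counter. No payload can be attached. */
int err_signal(struct err_notifiers *n, enum err_kind kind)
{
    uint64_t one = 1;

    return write(n->fd[kind], &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Consumer side: read and reset the counter. Returns the number of signals
 * since the last drain, or 0 if the eventfd has not been signalled.
 */
uint64_t err_drain(struct err_notifiers *n, enum err_kind kind)
{
    uint64_t count = 0;

    if (read(n->fd[kind], &count, sizeof(count)) != (ssize_t)sizeof(count))
        return 0;
    return count;
}
```

Signalling ERR_NONFATAL twice and then draining it yields a count of 2, while draining ERR_FATAL yields 0. Any detail beyond the count (for an IOMMU fault, for instance) would have to be fetched through some other channel, such as the region or ioctl mentioned in the reply.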