On 03/25/2015 10:41 AM, Alex Williamson wrote:
On Wed, 2015-03-25 at 09:53 +0800, Chen Fan wrote:
On 03/16/2015 10:09 PM, Alex Williamson wrote:
On Mon, 2015-03-16 at 15:35 +0800, Chen Fan wrote:
On 03/16/2015 11:52 AM, Alex Williamson wrote:
On Mon, 2015-03-16 at 11:05 +0800, Chen Fan wrote:
On 03/14/2015 06:34 AM, Alex Williamson wrote:
On Thu, 2015-03-12 at 18:23 +0800, Chen Fan wrote:
When the vfio device encounters an uncorrectable error in the host,
the vfio_pci driver signals the eventfd registered by this vfio
device, which results in the QEMU eventfd handler getting invoked.
This patch passes the error to the guest and has the guest driver
recover from the error.
What is going to be the typical recovery mechanism for the guest? I'm
concerned that the topology of the device in the guest doesn't
necessarily match the topology of the device in the host, so if the
guest were to attempt a bus reset to recover a device, for instance,
what happens?
The recovery mechanism is that when the guest gets an AER error from a device,
the guest clears the corresponding status bit in the device register. For a
device that needs a reset, the guest AER driver would reset all devices under the bus.
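To illustrate the "clear the corresponding status bit" step: the PCIe AER status registers use write-1-to-clear semantics, so software acknowledges an error by writing back the bits it read. The following is only a toy model of that behaviour, not guest or QEMU code; the class, the helper, and the bit position are hypothetical names chosen for illustration.

```python
# Toy model of a write-1-to-clear (W1C) status register, the semantics
# used by the PCIe AER error status registers.
# Assumption: the bit position below is illustrative only.
HYPOTHETICAL_ERR_BIT = 1 << 4


class W1CStatusRegister:
    def __init__(self, value=0):
        self.value = value

    def hw_set(self, mask):
        """Hardware latches an error: the given bits become 1."""
        self.value |= mask

    def write(self, mask):
        """Software write: each 1 in mask clears that bit, 0s are ignored."""
        self.value &= ~mask

    def read(self):
        return self.value


def clear_reported_errors(reg):
    """Read the latched bits and write them back to acknowledge them."""
    status = reg.read()
    if status:
        reg.write(status)
    return status
```

Note that clearing the status bit only acknowledges the error; it does not by itself put the device back into a working state, which is why the reset question below matters.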
Sorry, I'm still confused, how does the guest aer driver reset all
devices under a bus? Are we talking about function-level, device
specific reset mechanisms or secondary bus resets? If the guest is
performing secondary bus resets, what guarantee do they have that it
will translate to a physical secondary bus reset? vfio may only do an
FLR when the bus is reset or it may not be able to do anything depending
on the available function-level resets and physical and virtual topology
of the device. Thanks,
In general, recovery depends on the behavior of the corresponding device driver,
e.g. whether it implements the error_detected and slot_reset callbacks.
For a link reset, it usually does a secondary bus reset.
And must we require a physical secondary bus reset for a vfio device
when the guest does a bus reset?
That depends on how the guest driver attempts recovery, doesn't it?
There are only a very limited number of cases where a secondary bus
reset initiated by the guest will translate to a secondary bus reset of
the physical device (iirc, single function device without FLR). In most
cases, it will at best be translated to an FLR. VFIO really only does
bus resets on VM reset because that's the only time we know that it's ok
to reset multiple devices. If the guest driver is depending on a
secondary bus reset to put the device into a recoverable state and we're
not able to provide that, then we're actually reducing containment of
the error by exposing AER to the guest and allowing it to attempt
recovery. So in practice, I'm afraid we're risking the integrity of the
VM by exposing AER to the guest and making it think that it can perform
recovery operations that are not effective. Thanks,
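The translation problem described above can be sketched roughly as follows. This is a simplified model built only from the statements in this thread (a guest bus reset becomes a physical bus reset only for a single-function device with no function-level reset), not the actual selection logic of pci_reset_function; the type, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AssignedDevice:
    has_flr: bool
    single_function: bool
    # Assumption: stands in for any other function-scoped reset method.
    has_other_function_reset: bool = False


def effective_reset(dev):
    """What a guest-initiated secondary bus reset turns into on the host."""
    if dev.has_flr:
        return "flr"
    if dev.has_other_function_reset:
        return "other function-level reset"
    if dev.single_function:
        # Only here does the guest's bus reset reach the physical bus.
        return "secondary bus reset"
    return "none"


def guest_expectation_met(dev):
    """True only if the guest really gets the bus reset it asked for."""
    return effective_reset(dev) == "secondary bus reset"
```

For example, a multifunction device without FLR yields "none" in this model: the guest believes it has reset the bus, while nothing happened to the physical device.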
I have also seen that for a device without FLR, it seems a hot reset can be done
via the VFIO_DEVICE_PCI_HOT_RESET ioctl to reset the physical slot or bus
in vfio_pci_reset. Does that address the recovery issues you described?
The hot reset interface can only be used when a) the user (QEMU) owns
all of the devices on the bus and b) we know we're resetting all of the
devices. That mostly limits its use to VM reset. I think that on a
secondary bus reset, we don't know the scope of the reset at the QEMU
vfio driver, so we only make use of reset methods with a function-level
scope. That would only result in a secondary bus reset if that's the
reset mechanism used by the host kernel's PCI code (pci_reset_function),
which is limited to single function devices on a secondary bus, with no
other reset mechanisms. The hot reset is also only available in some
configurations, for instance if we have a dual-port NIC where each
function is a separate IOMMU group, then we clearly cannot do a hot
reset unless both functions are assigned to the same VM _and_ appear to
the guest on the same virtual bus. So even if we could know the scope
of the reset in the QEMU vfio driver, we can only make use of it under
very strict guest configurations. Thanks,
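The configuration constraint in the dual-port NIC example can be written down as a small predicate. This is a sketch of the conditions as stated in this thread only (every function on the physical bus assigned to the same VM and exposed on the same virtual bus); it is not how the QEMU vfio driver or the kernel checks ownership, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HostFunction:
    name: str
    vm: Optional[str]           # VM the function is assigned to, None if unassigned
    virtual_bus: Optional[str]  # hypothetical label of the guest bus it appears on


def hot_reset_usable(functions_on_bus):
    """True if a hot reset of the physical bus matches one guest bus reset."""
    vms = {f.vm for f in functions_on_bus}
    buses = {f.virtual_bus for f in functions_on_bus}
    same_vm = len(vms) == 1 and None not in vms
    same_virtual_bus = len(buses) == 1 and None not in buses
    return same_vm and same_virtual_bus


# Dual-port NIC: both ports in one VM, on one virtual bus.
ok = [HostFunction("nic.0", "vm1", "pci.1"), HostFunction("nic.1", "vm1", "pci.1")]
# Same NIC, but the second port is given to another VM.
split = [HostFunction("nic.0", "vm1", "pci.1"), HostFunction("nic.1", "vm2", "pci.1")]
```

Here hot_reset_usable(ok) holds while hot_reset_usable(split) does not, which is the "very strict guest configurations" point in code form.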
Hi Alex,
Do you have any idea or scenario to fix or avoid this issue?
Thanks,
Chen
Alex