On 2/7/25 17:54, Peter Xu wrote:
On Thu, Feb 06, 2025 at 03:21:51PM +0100, Eric Auger wrote:
This is a follow-up of Peter's attempt to fix the fact that
vIOMMUs are likely to be reset before the device they protect:
[PATCH 0/4] intel_iommu: Reset vIOMMU after all the rest of devices
https://lore.kernel.org/all/20240117091559.144730-1-pet...@redhat.com/
This is especially observed with virtio devices when a qmp system_reset
command is sent but also with VFIO devices.
This series puts the vIOMMU reset in the 3-phase exit callback.
This scheme was tested successfully with virtio devices and some
VFIO devices. Nevertheless, not all the topologies have been
tested yet.
Eric,
It's great to know that we seem to be able to fix everything in such a small
changeset!
I would like to double check two things with you here:
- For VFIO's reset hook, it looks like we have landed more changes so that
vfio's reset function is now a TYPE_LEGACY_RESET, and it always does the
reset during the "hold" phase only (via legacy_reset_hold()). That part
will make sure the vIOMMU (if switching to an exit()-only reset) is ordered
properly with VFIO. Is my understanding correct here?
Eric,
We were still seeing DMA errors from VFIO devices:
VFIO_MAP_DMA failed: Bad address
with this series at shutdown (machine or OS) when using an intel_iommu
device. We could see that the vIOMMU was reset while the device DMAs
were still alive. Do you know why now?
Thanks,
C.
- If some PCIe devices provide their own phase.exit(), would the
relative order of the PCIe device's phase.exit() and the vIOMMU's
phase.exit() matter (if vIOMMUs switch to an exit()-only approach
like this one)?
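
The ordering behind the two questions above can be pictured with a toy
model. This is only a sketch, not QEMU code: every class and function
name below is hypothetical, and it assumes that each reset phase
(enter, hold, exit) runs on all objects before the next phase starts.

```python
# Toy model of a three-phase reset (enter, hold, exit). Not QEMU code:
# all names here are hypothetical and only illustrate phase ordering.

class Resettable:
    def __init__(self, name, log):
        self.name = name
        self.log = log

    def enter(self):
        pass

    def hold(self):
        pass

    def exit(self):
        pass


class HoldOnlyDevice(Resettable):
    """Device whose whole reset runs in the hold phase (legacy style)."""

    def hold(self):
        self.log.append(("hold", self.name))


class ExitOnlyDevice(Resettable):
    """Object whose whole reset runs in the exit phase."""

    def exit(self):
        self.log.append(("exit", self.name))


def system_reset(objs):
    # Assumption: a phase is run on every object before the next one starts.
    for phase in ("enter", "hold", "exit"):
        for obj in objs:
            getattr(obj, phase)()


# Case 1: the vIOMMU is listed first on purpose, yet it is reset last.
log = []
system_reset([
    ExitOnlyDevice("viommu", log),
    HoldOnlyDevice("vfio-pci", log),
    HoldOnlyDevice("virtio-net", log),
])

# Case 2: a device with its own exit() handler. Its order relative to the
# vIOMMU is no longer decided by the phases, only by traversal order.
log2 = []
system_reset([
    ExitOnlyDevice("viommu", log2),
    ExitOnlyDevice("pcie-dev", log2),
])
```

In this model a hold-only device is always reset before an exit-only
vIOMMU, whatever the traversal order. Two exit() handlers, on the other
hand, are ordered only by the traversal, which is what the second
question is about.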
PS: it would be great to attach such information to either the cover letter or
the commit message. But this is definitely not a request to repost the patchset;
if Michael includes the Message-ID when merging, that will be enough to help
anyone find this discussion again.
Thanks!