Hi Cédric,
On 2/7/25 6:25 PM, Cédric Le Goater wrote:
> On 2/7/25 17:54, Peter Xu wrote:
>> On Thu, Feb 06, 2025 at 03:21:51PM +0100, Eric Auger wrote:
>>> This is a follow-up of Peter's attempt to fix the fact that
>>> vIOMMUs are likely to be reset before the device they protect:
>>>
>>> [PATCH 0/4] intel_iommu: Reset vIOMMU after all the rest of devices
>>> https://lore.kernel.org/all/20240117091559.144730-1-pet...@redhat.com/
>>>
>>> This is especially observed with virtio devices when a qmp system_reset
>>> command is sent but also with VFIO devices.
>>>
>>> This series puts the vIOMMU reset in the 3-phase exit callback.
>>>
>>> This scheme was tested successful with virtio-devices and some
>>> VFIO devices. Nevertheless not all the topologies have been
>>> tested yet.
>>
>> Eric,
>>
>> It's great to know that we seem to be able to fix everything in such
>> small changeset!
>>
>> I would like to double check two things with you here:
>>
>>   - For VFIO's reset hook, looks like we have landed more changes so that
>>     vfio's reset function is now a TYPE_LEGACY_RESET, and it always do the
>>     reset during "hold" phase only (via legacy_reset_hold()).  That part
>>     will make sure vIOMMU (if switching to exit()-only reset) will order
>>     properly with VFIO.  Is my understanding correct here?
>
>
> Eric,
>
> We were still seeing DMA errors from VFIO devices :
>
>   VFIO_MAP_DMA failed: Bad address
>
> with this series at shutdown (machine or OS) when using an intel_iommu
> device. We could see that the VIOMMU was reset and the device DMAs
> were still alive. Do you know why now ?

I have started debugging this other case. At first sight this looks like
a different problem. First, it occurs on a qmp system_powerdown. The
error messages do not occur on qemu reset but rather as a result of the
guest disabling the intel iommu and, curiously, when the aliased IOMMU MR
(nodma) is re-enabled.

I need more time to debug this.

Eric
>
> Thanks,
>
> C.
>
>
>>
>>   - Is it possible if some PCIe devices that will provide its own
>>     phase.exit(), would it matter on the order of PCIe device's
>>     phase.exit() and vIOMMU's phase.exit() (if vIOMMUs switch to use
>>     exit()-only approach like this one)?
>>
>> PS: it would be great to attach such information in either cover letter or
>> commit message.  But definitely not a request to repost the patchset, if
>> Michael would have Message-ID when merge that'll be far enough to help
>> anyone find this discussion again.
>>
>> Thanks!
>>
>
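
For reference, the ordering argument in Peter's first question can be sketched as a toy model. This is plain Python, not QEMU code: the class names, the device names and the system_reset() helper are all hypothetical, and the model assumes that each reset phase (enter, hold, exit) completes for every device before the next phase starts. Under that assumption, a device that resets in hold is always reset before a vIOMMU that resets in exit, whatever the order in which they are visited.

```python
"""Toy model of a three-phase reset (enter, hold, exit). Not QEMU code."""


class Resettable:
    """Base object with no-op reset phases (hypothetical names)."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def enter(self):
        pass

    def hold(self):
        pass

    def exit(self):
        pass


class LegacyResetDevice(Resettable):
    """Stands in for a device whose whole reset runs in the hold phase."""

    def hold(self):
        self.log.append(f"{self.name}: reset in hold")


class ExitOnlyIOMMU(Resettable):
    """Stands in for a vIOMMU whose reset runs only in the exit phase."""

    def exit(self):
        self.log.append(f"{self.name}: reset in exit")


def system_reset(devices):
    # Assumption: a phase is run on every device before the next phase begins.
    for phase in ("enter", "hold", "exit"):
        for dev in devices:
            getattr(dev, phase)()


def run_demo():
    log = []
    # The vIOMMU is deliberately visited first within each phase.
    devices = [ExitOnlyIOMMU("viommu", log), LegacyResetDevice("vfio-dev", log)]
    system_reset(devices)
    return log


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

The model says nothing about the relative order of two exit() hooks, which is the open point in Peter's second question, nor about the system_powerdown case above, where the errors do not come from the reset path.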