Hi Cédric,

On 2/7/25 6:25 PM, Cédric Le Goater wrote:
> On 2/7/25 17:54, Peter Xu wrote:
>> On Thu, Feb 06, 2025 at 03:21:51PM +0100, Eric Auger wrote:
>>> This is a follow-up of Peter's attempt to fix the fact that
>>> vIOMMUs are likely to be reset before the device they protect:
>>>
>>> [PATCH 0/4] intel_iommu: Reset vIOMMU after all the rest of devices
>>> https://lore.kernel.org/all/20240117091559.144730-1-pet...@redhat.com/
>>>
>>> This is especially observed with virtio devices when a qmp system_reset
>>> command is sent but also with VFIO devices.
>>>
>>> This series puts the vIOMMU reset in the 3-phase exit callback.
>>>
>>> This scheme was tested successful with virtio-devices and some
>>> VFIO devices. Nevertheless not all the topologies have been
>>> tested yet.
>>
>> Eric,
>>
>> It's great to know that we seem to be able to fix everything in such
>> small changeset!
>>
>> I would like to double check two things with you here:
>>
>>    - For VFIO's reset hook, looks like we have landed more changes so that
>>      vfio's reset function is now a TYPE_LEGACY_RESET, and it always do the
>>      reset during "hold" phase only (via legacy_reset_hold()).  That part
>>      will make sure vIOMMU (if switching to exit()-only reset) will order
>>      properly with VFIO.  Is my understanding correct here?
>
>
> Eric,
>
> We were still seeing DMA errors from VFIO devices :
>
>   VFIO_MAP_DMA failed: Bad address
>
> with this series at shutdown (machine or OS) when using an intel_iommu
> device. We could see that the VIOMMU was reset and the device DMAs
> were still alive. Do you know why now ?

I have started debugging this other case. At first sight this looks like
a different problem. First, it occurs on a qmp system_powerdown.
The error messages do not occur on qemu reset but rather as a result of
the guest disabling the intel iommu and, curiously, when the aliased IOMMU
MR (nodma) is re-enabled. I need more time to debug this.
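
For reference, the ordering Peter asks about in his first point (VFIO
resetting in the "hold" phase, the vIOMMU resetting in the "exit" phase)
can be pictured with a toy model. This is only a sketch and not QEMU code:
the Dev class, system_reset() and the device names are hypothetical, and
the only assumption carried over is that each reset phase runs across all
devices before the next phase starts.

```python
# Toy model only: names are hypothetical and this is not QEMU code.
PHASES = ("enter", "hold", "exit")


class Dev:
    def __init__(self, name, reset_phase):
        self.name = name
        # Phase in which this device actually resets its state.
        self.reset_phase = reset_phase


def system_reset(devices):
    """Run each phase over all devices before starting the next phase."""
    log = []
    for phase in PHASES:
        for dev in devices:
            if dev.reset_phase == phase:
                log.append((phase, dev.name))
    return log


# The vIOMMU is listed first on purpose: the phase, not the list order,
# decides when each device is reset.
order = system_reset([Dev("viommu", "exit"), Dev("vfio-pci", "hold")])
print(order)
```

In this model the VFIO device is always reset before the vIOMMU, whatever
the traversal order. It says nothing about two devices that both reset in
exit(), which is Peter's second question.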

Eric

>
> Thanks,
>
> C.
>
>
>>
>>    - Is it possible if some PCIe devices that will provide its own
>>      phase.exit(), would it matter on the order of PCIe device's
>>      phase.exit() and vIOMMU's phase.exit() (if vIOMMUs switch to use
>>      exit()-only approach like this one)?
>>
>> PS: it would be great to attach such information in either cover letter or
>> commit message.  But definitely not a request to repost the patchset, if
>> Michael would have Message-ID when merge that'll be far enough to help
>> anyone find this discussion again.
>>
>> Thanks!
>>
>

