Hi,

On 2/7/25 6:31 PM, Peter Xu wrote:
> On Fri, Feb 07, 2025 at 05:06:20PM +0000, Peter Maydell wrote:
>> On Fri, 7 Feb 2025 at 16:54, Peter Xu <pet...@redhat.com> wrote:
>>> On Thu, Feb 06, 2025 at 03:21:51PM +0100, Eric Auger wrote:
>>>> This is a follow-up of Peter's attempt to fix the fact that
>>>> vIOMMUs are likely to be reset before the device they protect:
>>>>
>>>> [PATCH 0/4] intel_iommu: Reset vIOMMU after all the rest of devices
>>>> https://lore.kernel.org/all/20240117091559.144730-1-pet...@redhat.com/
>>>>
>>>> This is especially observed with virtio devices when a qmp system_reset
>>>> command is sent but also with VFIO devices.
>>>>
>>>> This series puts the vIOMMU reset in the 3-phase exit callback.
>>>>
>>>> This scheme was tested successfully with virtio-devices and some
>>>> VFIO devices. Nevertheless not all the topologies have been
>>>> tested yet.
>>> Eric,
>>>
>>> It's great to know that we seem to be able to fix everything in such a small
>>> changeset!
>>>
>>> I would like to double check two things with you here:
>>>
>>>   - For VFIO's reset hook, looks like we have landed more changes so that
>>>     vfio's reset function is now a TYPE_LEGACY_RESET, and it always does the
>>>     reset during "hold" phase only (via legacy_reset_hold()).  That part
>>>     will make sure vIOMMU (if switching to exit()-only reset) will order
>>>     properly with VFIO.  Is my understanding correct here?
>> Yes, we now do a reset of the whole system as a three-phase setup,
>> and the old pre-three-phase reset APIs like qemu_register_reset() and
>> device_class_set_legacy_reset() all happen during the "hold" phase.
>>
>>>   - If some PCIe devices provide their own phase.exit(), would the
>>>     relative order of the PCIe device's phase.exit() and the vIOMMU's
>>>     phase.exit() matter (if vIOMMUs switch to an exit()-only approach
>>>     like this one)?
>> It's certainly possible for a PCIe device to implement
>> a three-phase reset which does things in the exit phase. However
>> I think I would say that such a device which didn't cancel all
>> outstanding DMA operations during either 'enter' or 'hold'
>> phases would be broken. If it did some other things during
>> the 'exit' phase I don't think the ordering of those vs the
>> iommu 'exit' handling should matter.
> Yes, this sounds fair.
>
>> (To some extent the splitting into three phases is trying
>> to set up a consistent model as outlined in docs/devel/reset.rst
>> and to some extent it's just a convenient way to get a basic
>> "this reset thing I need to do must happen after some other
>> device has done its reset things" which you can achieve
>> by ad-hoc putting them in different phases. Ideally we get
>> mostly the former and a little pragmatic dose of the latter,
>> but the consistent model is not very solidly nailed down
>> so I have a feeling the proportions may not be quite as
>> lopsided as we'd like :-) )
> Yes, it's a good move that we can have other ways to fix all the problems
> without major surgery, and it also looks solid and clean if we have a plan to
> fix any outlier PCIe devices.
>
> If there will be a repost after all, not sure if Eric would like to add
> some of the above discussion into either some commit messages or cover letter.
> Or some comment in the code might be even better.
Yes, I will definitely augment the commit messages/cover letter with all
those considerations. Thank you very much for this rich discussion!
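
To make the ordering argument above concrete, here is a toy Python model
of a three-phase reset. It is only a sketch and not QEMU code: every class
and function name in it is hypothetical. It assumes that each phase runs
across all devices before the next phase starts, that a legacy-reset device
quiesces its DMA in 'hold', and that the vIOMMU resets only in 'exit'.

```python
# Toy model only. This is NOT QEMU code; all names here are hypothetical.
PHASES = ("enter", "hold", "exit")


class Device:
    """Base device: every reset phase is a no-op by default."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def enter(self):
        pass

    def hold(self):
        pass

    def exit(self):
        pass


class LegacyResetDevice(Device):
    """Models a device whose legacy reset handler runs in the 'hold' phase."""

    def __init__(self, name, log):
        super().__init__(name, log)
        self.dma_active = True

    def hold(self):
        # The legacy reset stops all outstanding DMA.
        self.dma_active = False
        self.log.append((self.name, "hold"))


class ToyIOMMU(Device):
    """Models a vIOMMU that resets its state only in the 'exit' phase."""

    def __init__(self, name, log, protected):
        super().__init__(name, log)
        self.protected = protected

    def exit(self):
        # By now every protected device must have quiesced its DMA.
        if any(dev.dma_active for dev in self.protected):
            raise RuntimeError("vIOMMU reset while DMA still active")
        self.log.append((self.name, "exit"))


def system_reset(devices):
    """Run each phase over all devices before moving to the next phase."""
    for phase in PHASES:
        for dev in devices:
            getattr(dev, phase)()


def run_demo():
    log = []
    vfio = LegacyResetDevice("vfio-dev", log)
    iommu = ToyIOMMU("viommu", log, protected=[vfio])
    # The vIOMMU is deliberately listed first: in this model the phase
    # split, not the list order, decides what happens when.
    system_reset([iommu, vfio])
    return log
```

In this model, run_demo() records the device's 'hold' step before the
vIOMMU's 'exit' step even though the vIOMMU comes first in the list. A
device that only stopped its DMA in its own 'exit' handler would trip the
check in ToyIOMMU.exit(), which is the "broken device" case mentioned
above.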

Eric
>
> Thanks!
>

