On Fri, Feb 07, 2025 at 05:06:20PM +0000, Peter Maydell wrote: > On Fri, 7 Feb 2025 at 16:54, Peter Xu <pet...@redhat.com> wrote: > > > > On Thu, Feb 06, 2025 at 03:21:51PM +0100, Eric Auger wrote: > > > This is a follow-up of Peter's attempt to fix the fact that > > > vIOMMUs are likely to be reset before the device they protect: > > > > > > [PATCH 0/4] intel_iommu: Reset vIOMMU after all the rest of devices > > > https://lore.kernel.org/all/20240117091559.144730-1-pet...@redhat.com/ > > > > > > This is especially observed with virtio devices when a qmp system_reset > > > command is sent but also with VFIO devices. > > > > > > This series puts the vIOMMU reset in the 3-phase exit callback. > > > > > > This scheme was tested successful with virtio-devices and some > > > VFIO devices. Nevertheless not all the topologies have been > > > tested yet. > > > > Eric, > > > > It's great to know that we seem to be able to fix everything in such small > > changeset! > > > > I would like to double check two things with you here: > > > > - For VFIO's reset hook, looks like we have landed more changes so that > > vfio's reset function is now a TYPE_LEGACY_RESET, and it always do the > > reset during "hold" phase only (via legacy_reset_hold()). That part > > will make sure vIOMMU (if switching to exit()-only reset) will order > > properly with VFIO. Is my understanding correct here? > > Yes, we now do a reset of the whole system as a three-phase setup, > and the old pre-three-phase reset APIs like qemu_register_reset() and > device_class_set_legacy_reset() all happen during the "hold" phase. > > > - Is it possible if some PCIe devices that will provide its own > > phase.exit(), would it matter on the order of PCIe device's > > phase.exit() and vIOMMU's phase.exit() (if vIOMMUs switch to use > > exit()-only approach like this one)? > > It's certainly possible for a PCIe device to implement > a three-phase reset which does things in the exit phase. However > I think I would say that such a device which didn't cancel all > outstanding DMA operations during either 'enter' or 'hold' > phases would be broken. If it did some other things during > the 'exit' phase I don't think the ordering of those vs the > iommu 'exit' handling should matter.
Yes, this sounds fair. > > (To some extent the splitting into three phases is trying > to set up a consistent model as outlined in docs/devel/reset.rst > and to some extent it's just a convenient way to get a basic > "this reset thing I need to do must happen after some other > device has done its reset things" which you can achieve > by ad-hoc putting them in different phases. Ideally we get > mostly the former and a little pragmatic dose of the latter, > but the consistent model is not very solidly nailed down > so I have a feeling the proportions may not be quite as > lopsided as we'd like :-) ) Yes, it's a good move that we can have other ways to fix all the problems without major surgery, and it also looks solid and clean if we have plan to fix any outlier PCIe devices. If there will be a repost after all, not sure if Eric would like to add some of above discussions into either some commit messages or cover letter. Or some comment in the code might be even better. Thanks! -- Peter Xu