On Fri, Jan 13, 2023 at 09:08:38AM +0000, David Woodhouse wrote:
> I'm looking at interrupt remapping (because I need to hook into the
> translation somehow to add PIRQ support for Xen which translates guest
> MSIs directly to KVM_IRQ_ROUTING_XEN_EVTCHN).
> 
> Am I right in understanding that it doesn't report faults on interrupts
> which can't be translated? It attempts to translate interrupts at the
> time the table is modified (vtd_int_remap()) or when an APIC access
> actually triggers an MSI (vtd_mem_ir_write()) but in neither case does
> it actually raise a fault?
>
> The behaviour we want here is that we only raise a fault when the IRQ
> actually *happens*. But that's hard in our current model where it looks
> like we pretranslate *everything* in advance and just let it run.
> 
> Here's a proposal for a model which could make it work (using VFIO as
> the example since that's the more complex part but it works for
> emulated MSI sources too):
> 
> We consume the VFIO eventfd *both* in userspace and the kernel. (Since 
> https://lore.kernel.org/kvm/20201027143944.648769-1-dw...@infradead.org/
> we can just keep listening on the VFIO eventfd in userspace and the
> kernel will eat all the events so you never notice. On older kernels we
> have to manually stop listening in userspace.)
> 
> When a translation is valid and should be considered 'cached' in the
> IOMMU, that's when we actually hook it up to the irqfd. 
> 
> We can ditch the iec invalidate callbacks (vtd_iec_notify_all) because
> all an invalidation needs to do is KVM_IRQFD_FLAG_DEASSIGN for the
> corresponding GSI.
> 
> (
> You might consider abusing a spare field in the KVM routing table to
> hold a cookie like the IRTE# so that you know *which* entries to
> invalidate. I couldn't possibly comment.
> 
>       /* 64-bit cookie for IOMMU to use for invalidation choices */
>       #define ire_ir_cookie(ire) ((ire)->u.adapter.ind_offset)
> 
>       /* Flags, to indicate a stale entry that needs retranslating
> */
>       #define ire_user_flags(ire) ((ire)->u.adapter.summary_offset)
>       #define IRE_USER_FLAG_STALE             1
> )
> 
> So when an interrupt happens and it's *untranslated*, that's when it
> gets raised to userspace to handle, e.g. in vfio_msi_interrupt(). That
> does the normal thing and attempts to deliver the guest MSI directly.
> We add a flag "bool delivering_now" to the X86IOMMUClass int_remap
> function, to allow it to distinguish between preemptive translations
> and actual delivery and to raise the fault in the latter case.
> 
> When the guest frobs a device's MSI table we can do the translation as
> we do at the moment, of course with the 'delivering_now' argument being
> false. And *if* the translation succeeds then we can install the IRQFD
> right away.
> 
> This model allows us to generate faults as the hardware would, and also
> improves the efficiency of invalidation by only invalidating what we
> need to. I haven't looked hard at how it works with an emulated AMD
> IOMMU, but I know that the Xen PIRQ support (which is where I came in)
> slots into it fairly trivially, using the PIRQ# as the 'cookie' for
> invalidation instead of the IRTE# that the Intel IOMMU uses.
> 

Don't see anything obviously wrong with this.
We really need Alex's input on VFIO though.

-- 
MST


Reply via email to