On Tue, Mar 18, 2025 at 07:31:36PM +0100, Eric Auger wrote:
> On 3/17/25 9:19 PM, Nicolin Chen wrote:
> > On Mon, Mar 17, 2025 at 04:24:53PM -0300, Jason Gunthorpe wrote:
> >> On Mon, Mar 17, 2025 at 12:10:19PM -0700, Nicolin Chen wrote:
> >>> Another question: how does an emulated device work with a vSMMUv3?
> >>> I could imagine that all the accel steps would be bypassed since
> >>> !sdev->idev. Yet, the emulated iotlb should cache its translation
> >>> so we will need to flush the iotlb, which will increase complexity
> >>> as the TLBI command dispatching function will need to be aware what
> >>> ASID is for emulated device and what is for vfio device..
> >> I think you should block it. We already expect different vSMMU's
> >> depending on the physical SMMU under the PCI device, it makes sense
> >> that a SW VFIO device would have its own, non-accelerated, vSMMU
> >> model in the guest.
> > Yea, I agree and it'd be cleaner for an implementation separating
> > them.
> >
> > In my mind, the general idea of "accel=on" is also to keep things
> > in a more efficient way: passthrough devices go to HW-accelerated
> > vSMMUs (separated PCIE buses), while emulated ones go to a vSMMU-
> > bypassed (PCIE0).

> Originally a specific SMMU device was needed to opt in for MSI reserved
> region ACPI IORT description which are not needed if you don't rely on
> S1+S2. However if we don't rely on this trick this was not even needed
> with legacy integration
> (https://patchwork.kernel.org/project/qemu-devel/cover/20180921081819.9203-1-eric.au...@redhat.com/).
> 
> Nevertheless I don't think anything prevents the acceleration granted
> device from also working with virtio/vhost devices for instance unless
> you unplug the existing infra. The translation and invalidation just
> should use different control paths (explicit translation requests,
> invalidation notifications towards vhost, ...).

smmuv3_translate() is per sdev, so it's easy.

Invalidation is done via commands, which could be tricky. Two options:
a) Broadcast each command to both the emulated and the accelerated
   paths.
b) ASID validation -- we'll need to keep track of a list of ASIDs
   for vfio devices, to compare against the ASID in each per-ASID
   command, potentially by trapping all CFGI_CD(_ALL) commands? Note
   that each vfio device may have multiple ASIDs (for multiple CDs).
Either a or b above will have some impact on efficiency.
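
To make option b more concrete, here is a rough, untested sketch of
the dispatch decision. All names in it (AccelAsidSet, route_tlbi, the
ROUTE_* values) are hypothetical and not existing QEMU code. It also
assumes that the guest never shares an ASID between an emulated device
and a vfio device; if that can happen, the per-ASID case would have to
fall back to both paths.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ASID_MAX 65536 /* SMMUv3 ASIDs are at most 16 bits wide */

/* Hypothetical: one bit per ASID learned from a trapped CFGI_CD
 * that belongs to a vfio (accelerated) device. */
typedef struct {
    uint8_t bits[ASID_MAX / 8];
} AccelAsidSet;

typedef enum { ROUTE_EMULATED, ROUTE_ACCEL, ROUTE_BOTH } TlbiRoute;

static void asid_set_add(AccelAsidSet *s, uint16_t asid)
{
    s->bits[asid / 8] |= (uint8_t)(1u << (asid % 8));
}

static void asid_set_del(AccelAsidSet *s, uint16_t asid)
{
    s->bits[asid / 8] &= (uint8_t)~(1u << (asid % 8));
}

static bool asid_set_has(const AccelAsidSet *s, uint16_t asid)
{
    return (s->bits[asid / 8] >> (asid % 8)) & 1u;
}

/* Decide where a TLBI command goes. Commands without an ASID are
 * broadcast (option a); per-ASID commands are looked up (option b). */
static TlbiRoute route_tlbi(const AccelAsidSet *s, bool per_asid,
                            uint16_t asid)
{
    if (!per_asid) {
        return ROUTE_BOTH;
    }
    return asid_set_has(s, asid) ? ROUTE_ACCEL : ROUTE_EMULATED;
}
```

Even in this minimal form, every per-ASID command pays for a lookup,
and the set has to be kept in sync on every CD configuration change.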

> Again, what justifies having different QEMU devices for the same IP?
> I understand that it simplifies the implementation, but I am not sure
> this is a good reason. Nevertheless, it is worth challenging. What is
> the plan for Intel IOMMU? Will we have 2 devices, the legacy device and
> one for nested?

Hmm, it seems that there are two different topics:
1. Use one SMMU device model (source code file; "iommu=" string)
   for both an emulated vSMMU and an HW-accelerated vSMMU.
2. Allow one vSMMU instance to work with both an emulated device
   and a passthrough device.
And I get that you want both 1 and 2.

I'm totally okay with 1, yet I see no compelling benefit from 2 that
would justify the increased complexity in the invalidation routine.

And another question about the mixed device attachment. Let's say
we have in the host:
  VFIO passthrough dev0 -> pSMMU0
  VFIO passthrough dev1 -> pSMMU1
Should we allow emulated devices to be flexibly plugged?
  dev0 -> vSMMU0 /* Hard requirement */
  dev1 -> vSMMU1 /* Hard requirement */
  emu0 -> vSMMU0 /* Soft requirement; can be vSMMU1 also */
  emu1 -> vSMMU1 /* Soft requirement; can be vSMMU0 also */
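
The hard/soft requirements above could be expressed as an attach-time
check along these lines. This is only an illustration of the question,
not a proposal for the actual code: the Dev and VSmmu types, the
psmmu_id field, and the allow_mixed switch are all made up for the
example.

```c
#include <stdbool.h>

/* Hypothetical: a vSMMU instance and the physical SMMU backing it. */
typedef struct {
    int psmmu_id;
} VSmmu;

/* Hypothetical: a device; psmmu_id is only meaningful for passthrough. */
typedef struct {
    bool passthrough;
    int psmmu_id;
} Dev;

static bool attach_allowed(const Dev *d, const VSmmu *v, bool allow_mixed)
{
    if (d->passthrough) {
        /* Hard requirement: must sit behind the matching pSMMU. */
        return d->psmmu_id == v->psmmu_id;
    }
    /* Soft requirement: an emulated device may go to any vSMMU,
     * but only if mixing with passthrough devices is permitted. */
    return allow_mixed;
}
```

With allow_mixed set to false, emulated devices would be rejected from
any accelerated vSMMU and would need a separate, non-accelerated one.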

Thanks
Nicolin
