On Tue, Mar 18, 2025 at 07:31:36PM +0100, Eric Auger wrote:
> On 3/17/25 9:19 PM, Nicolin Chen wrote:
> > On Mon, Mar 17, 2025 at 04:24:53PM -0300, Jason Gunthorpe wrote:
> >> On Mon, Mar 17, 2025 at 12:10:19PM -0700, Nicolin Chen wrote:
> >>> Another question: how does an emulated device work with a vSMMUv3?
> >>> I could imagine that all the accel steps would be bypassed since
> >>> !sdev->idev. Yet, the emulated iotlb should cache its translation
> >>> so we will need to flush the iotlb, which will increase complexity
> >>> as the TLBI command dispatching function will need to be aware what
> >>> ASID is for emulated device and what is for vfio device..
> >> I think you should block it. We already expect different vSMMU's
> >> depending on the physical SMMU under the PCI device, it makes sense
> >> that a SW VFIO device would have it's own, non-accelerated, vSMMU
> >> model in the guest.
> > Yea, I agree and it'd be cleaner for an implementation separating
> > them.
> >
> > In my mind, the general idea of "accel=on" is also to keep things
> > in a more efficient way: passthrough devices go to HW-accelerated
> > vSMMUs (separated PCIE buses), while emulated ones go to a vSMMU-
> > bypassed (PCIE0).
> Originally a specific SMMU device was needed to opt in for MSI reserved
> region ACPI IORT description which are not needed if you don't rely on
> S1+S2. However if we don't rely on this trick this was not even needed
> with legacy integration
> (https://patchwork.kernel.org/project/qemu-devel/cover/20180921081819.9203-1-eric.au...@redhat.com/).
>
> Nevertheless I don't think anything prevents the acceleration granted
> device from also working with virtio/vhost devices for instance unless
> you unplug the existing infra. The translation and invalidation just
> should use different control paths (explicit translation requests,
> invalidation notifications towards vhost, ...).

smmuv3_translate() is per sdev, so it's easy.

Invalidation is done via commands, which could be tricky:
a) Broadcast command
b) ASID validation -- we'll need to keep track of a list of ASIDs
   for vfio device to compare the ASID in each per-ASID command,
   potentially by trapping all CFGI_CD(_ALL) commands? Note that
   each vfio device may have multiple ASIDs (for multiple CDs).

Either a or b above will have some validation efficiency impact.

> Again, what does legitimate to have different qemu devices for the same
> IP? I understand that it simplifies the implementation but I am not sure
> this is a good reason. Nevertheless it worth challenging. What is the
> plan for intel iommu? Will we have 2 devices, the legacy device and one
> for nested?

Hmm, it seems that there are two different topics:
1. Use one SMMU device model (source code file; "iommu=" string)
   for both an emulated vSMMU and an HW-accelerated vSMMU.
2. Allow one vSMMU instance to work with both an emulated device
   and a passthrough device.

And I get that you want both 1 and 2. I'm totally okay with 1,
yet see no compelling benefit from 2 for the increased complexity
in the invalidation routine.

And another question about the mixed device attachment.
Let's say we have in the host:
  VFIO passthrough dev0 -> pSMMU0
  VFIO passthrough dev1 -> pSMMU1

Should we allow emulated devices to be flexibly plugged?
  dev0 -> vSMMU0 /* Hard requirement */
  dev1 -> vSMMU1 /* Hard requirement */
  emu0 -> vSMMU0 /* Soft requirement; can be vSMMU1 also */
  emu1 -> vSMMU1 /* Soft requirement; can be vSMMU0 also */

Thanks
Nicolin
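
For reference, the ASID validation in option (b) above could look
roughly like the sketch below. It is a Python illustration only, not
QEMU code: AsidTracker, on_cfgi_cd() and dispatch_tlbi() are
hypothetical names, and it assumes the CDs of a device can be re-read
whenever a CFGI_CD(_ALL) command is trapped. It tracks ASIDs for both
device kinds so that an ASID shared by an emulated and a vfio device
still reaches both paths.

```python
class AsidTracker:
    """Hypothetical bookkeeping of which ASIDs each device currently uses."""

    def __init__(self):
        # stream ID -> (device kind, ASIDs found in that device's CDs)
        self._by_sid = {}

    def on_cfgi_cd(self, sid, kind, asids):
        """Refresh the ASID set of one device after a trapped CFGI_CD(_ALL).

        kind is "vfio" or "emulated"; a device may own several ASIDs
        because it may have several CDs.
        """
        self._by_sid[sid] = (kind, frozenset(asids))

    def owners(self, asid):
        """Return the device kinds that currently use this ASID."""
        return {kind for kind, asids in self._by_sid.values() if asid in asids}


def dispatch_tlbi(tracker, asid=None):
    """Decide which invalidation paths a TLBI command must reach.

    asid=None models a command that carries no ASID (case a: broadcast).
    Otherwise the ASID is checked against the tracked sets (case b).
    """
    if asid is None:
        return {"vfio", "emulated"}
    return tracker.owners(asid)


tracker = AsidTracker()
tracker.on_cfgi_cd(0x10, "vfio", [1, 2])      # passthrough device, two CDs
tracker.on_cfgi_cd(0x20, "emulated", [3])     # emulated device, one CD
print(sorted(dispatch_tlbi(tracker, asid=1)))
print(sorted(dispatch_tlbi(tracker)))
```

The lookup in owners() walks every tracked device for each per-ASID
command, which is the kind of validation cost mentioned above; a
reverse map from ASID to owners would trade that for more state to
keep consistent on every CFGI_CD.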