On 3/19/25 2:09 PM, Eric Auger wrote:
Hi Nicolin,
On 3/19/25 6:14 PM, Nicolin Chen wrote:
On Wed, Mar 19, 2025 at 05:45:51PM +0100, Eric Auger wrote:
On 3/17/25 8:10 PM, Nicolin Chen wrote:
On Mon, Mar 17, 2025 at 07:07:52PM +0100, Eric Auger wrote:
On 3/17/25 6:54 PM, Nicolin Chen wrote:
On Wed, Mar 12, 2025 at 04:15:10PM +0100, Eric Auger wrote:
On 3/11/25 3:10 PM, Shameer Kolothum wrote:
Based on SMMUv3 as a parent device, add a user-creatable smmuv3-accel
device. In order to support vfio-pci dev assignment with a Guest
guest
SMMUv3, the physical SMMUv3 has to be configured in nested(S1+s2)
nested (s1+s2)
mode, with Guest owning the S1 page tables. Subsequent patches will
the guest
add support for smmuv3-accel to provide this.
Can't this -accel smmu also work with emulated devices? Do we want
exclusive usage?
Is there any benefit from emulated devices working in the HW-
accelerated nested translation mode?
Not really, but do we have any justification for using a different device
name in accel mode? I am not even sure that the accel option is really
needed. Ideally the qemu device should be able to detect it is
protecting a VFIO device, in which case it shall check whether nested is
supported by the host SMMU and then automatically turn on accel mode?
I gave the example of the vfio device, which has a different class
implementation depending on whether the iommufd option is set or not.
Do you mean that we should just create a regular smmuv3 device and
let a VFIO device turn on this smmuv3's accel mode depending on
its LEGACY/IOMMUFD class?
No, this is not what I meant. I gave an example where, depending on an
option passed to the VFIO device, you choose one class implementation or
the other.
Option means something like this:
-device smmuv3,accel=on
instead of
-device "smmuv3-accel"
?
Yea, I think that's good.
Yeah, actually that's a big debate for not much. From an implementation
pov that shall not change much. The only doubt I have is that if we need to
conditionally expose the MSI RESV regions, it is easier to do if we
detect we have a smmuv3-accel. What the option allows is the auto mode.
Another question: how does an emulated device work with a vSMMUv3?
I don't get your question. vSMMUv3 currently only works with emulated
devices. Did you mean accelerated SMMUv3?
Yea. If "accel=on", how does an emulated device work with that?
I could imagine that all the accel steps would be bypassed since
!sdev->idev. Yet, the emulated iotlb should cache its translation
so we will need to flush the iotlb, which will increase complexity
as the TLBI command dispatching function will need to be aware of which
ASID is for an emulated device and which is for a vfio device.
I don't get the issue. For an emulated device you go through the usual
translate path, which indeed caches configs and translations. In case the
guest invalidates something, you know the SID and you find the entries
in the cache that are tagged by this SID.
In case you have an accelerated device (indeed if sdev->idev) you don't
exercise that path. On invalidation you detect that the SID matches a VFIO
device and propagate the invalidations to the host instead. On
invalidation you should be able to detect pretty easily whether you need to
flush the emulated caches or propagate the invalidations. Am I missing some
additional problem?
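
[Illustration] The per-SID dispatch described above can be sketched roughly as follows. This is a minimal standalone model, not QEMU code: the SketchDev type, its has_idev flag (standing in for a non-NULL sdev->idev), the counters, and the sketch_* helpers are all hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a device behind the vSMMU (not QEMU's real type). */
typedef struct {
    uint32_t sid;                /* StreamID the guest uses for this device */
    bool has_idev;               /* stands in for sdev->idev != NULL */
    unsigned emulated_flushes;   /* times the emulated caches were flushed */
    unsigned host_invalidations; /* times the command was sent to the host */
} SketchDev;

typedef enum { INV_NONE, INV_EMULATED, INV_HOST } InvTarget;

/* Look up the device that owns a given SID; NULL if there is none. */
static SketchDev *sketch_find_dev(SketchDev *devs, size_t n, uint32_t sid)
{
    for (size_t i = 0; i < n; i++) {
        if (devs[i].sid == sid) {
            return &devs[i];
        }
    }
    return NULL;
}

/*
 * Dispatch an invalidation that carries a SID: a VFIO-backed device gets
 * the invalidation propagated to the host, an emulated device gets its
 * software caches flushed.
 */
static InvTarget sketch_invalidate_sid(SketchDev *devs, size_t n, uint32_t sid)
{
    assert(devs != NULL);

    SketchDev *dev = sketch_find_dev(devs, n, sid);
    if (!dev) {
        return INV_NONE;
    }
    if (dev->has_idev) {
        dev->host_invalidations++;
        return INV_HOST;
    }
    dev->emulated_flushes++;
    return INV_EMULATED;
}
```

Note that this only covers invalidations that carry a SID; it says nothing about commands that are tagged some other way.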
I am not saying we should support emulated devices and VFIO devices in the
same guest iommu group. But I don't see why we couldn't easily plug the
accelerated logic into the current logic for emulation/vhost and not
require a different qemu device.
Hmm, feels like I fundamentally misunderstood your point.
a) We implement the device model with the same piece of code but
only provide an option "accel=on/off" to switch mode. And both
passthrough devices and emulated devices can attach to the same
"accel=on" device.
I think we all agree we don't want that use case in general. However,
I was effectively questioning why it couldn't work, maybe at the expense
of some perf degradation.
b) We implement the device model with the same piece of code but
only provide an option "accel=on/off" to switch mode. Then, a
passthrough device can attach to an "accel=on" device, but an
emulated device can only attach to an "accel=off" SMMU device.
I was thinking that you want case (a). But actually you were just
talking about case (b)? I think (b) is totally fine.
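
[Illustration] Case (b) amounts to an attach-time check along these lines. This is a sketch under stated assumptions, not the proposed implementation: SketchSmmu, SketchDevKind, and sketch_can_attach are hypothetical names, and whether a passthrough device may attach to an "accel=off" instance is not settled in this thread, so the sketch leaves that combination unrestricted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device kinds for this example only. */
typedef enum { SKETCH_DEV_EMULATED, SKETCH_DEV_PASSTHROUGH } SketchDevKind;

/* Hypothetical vSMMU instance; 'accel' mirrors the "accel=on/off" option. */
typedef struct {
    bool accel;
} SketchSmmu;

/*
 * Case (b): an emulated device may only attach to an accel=off instance.
 * Assumption: passthrough on accel=off is not rejected here because the
 * discussion does not rule on it.
 */
static bool sketch_can_attach(const SketchSmmu *smmu, SketchDevKind kind)
{
    assert(smmu != NULL);

    if (smmu->accel && kind == SKETCH_DEV_EMULATED) {
        return false;
    }
    return true;
}
```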
We certainly can't do case (a): not all TLBI commands give an "SID"
field (so we would have to broadcast, i.e. the underlying SMMU HW would run
commands that were meant for emulated devices only); in case of
vCMDQ, commands for emulated devices would be issued to real HW and
I am still confused about that. For instance, if the guest sends an
NH_ASID or NH_VA invalidation and it happens that both the emulated device
and the VFIO device share the same cd.asid (same guest iommu domain, which
practically should not happen), why shouldn't we propagate the
it can't ... on ARM ... PCIe only, no shared iommu domain btwn devices.
Isn't this another reason (perf) why emulated devices & physical devices should
be on different vSMMUs ... so it can be distinguished how deep (to hw)
or how wide (a broadcast) actions like TLBI are implemented, or how they
impact other devices?
invalidation to the host? Does the problem come from the usage of vCMDQ
or would you foresee the same problem with a generic physical SMMU?
Thanks
Eric
trigger HW errors.
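
[Illustration] The routing problem in case (a) can be sketched as below. In the SMMUv3 command set, CMD_CFGI_STE, CMD_CFGI_CD and CMD_ATC_INV carry a StreamID, while CMD_TLBI_NH_ASID and CMD_TLBI_NH_VA are tagged by ASID/VMID only. The enum, the SketchRoute type, and the sketch_* helpers are hypothetical names for this example; real opcode values are deliberately not used.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of SMMUv3 commands (no real opcode values). */
typedef enum {
    SKETCH_CMD_CFGI_STE,
    SKETCH_CMD_CFGI_CD,
    SKETCH_CMD_ATC_INV,
    SKETCH_CMD_TLBI_NH_ASID,
    SKETCH_CMD_TLBI_NH_VA,
    SKETCH_CMD_COUNT
} SketchCmd;

typedef enum { SKETCH_ROUTE_BY_SID, SKETCH_ROUTE_BROADCAST } SketchRoute;

/* True if the command format carries a StreamID field. */
static bool sketch_cmd_has_sid(SketchCmd cmd)
{
    assert(cmd < SKETCH_CMD_COUNT);

    switch (cmd) {
    case SKETCH_CMD_CFGI_STE:
    case SKETCH_CMD_CFGI_CD:
    case SKETCH_CMD_ATC_INV:
        return true;
    default:
        return false; /* TLBI_NH_* are tagged by ASID/VMID, not SID */
    }
}

/*
 * With emulated and passthrough devices mixed behind one instance, a
 * command without a SID cannot be steered to a single device, so the
 * dispatcher has to fall back to broadcasting it.
 */
static SketchRoute sketch_route(SketchCmd cmd)
{
    return sketch_cmd_has_sid(cmd) ? SKETCH_ROUTE_BY_SID
                                   : SKETCH_ROUTE_BROADCAST;
}
```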
Thanks
Nicolin