On 3/21/25 2:26 AM, Donald Dutile wrote:
>
>
> On 3/19/25 2:09 PM, Eric Auger wrote:
>> Hi Nicolin,
>>
>>
>> On 3/19/25 6:14 PM, Nicolin Chen wrote:
>>> On Wed, Mar 19, 2025 at 05:45:51PM +0100, Eric Auger wrote:
>>>>
>>>>
>>>> On 3/17/25 8:10 PM, Nicolin Chen wrote:
>>>>> On Mon, Mar 17, 2025 at 07:07:52PM +0100, Eric Auger wrote:
>>>>>> On 3/17/25 6:54 PM, Nicolin Chen wrote:
>>>>>>> On Wed, Mar 12, 2025 at 04:15:10PM +0100, Eric Auger wrote:
>>>>>>>> On 3/11/25 3:10 PM, Shameer Kolothum wrote:
>>>>>>>>> Based on SMMUv3 as a parent device, add a user-creatable
>>>>>>>>> smmuv3-accel
>>>>>>>>> device. In order to support vfio-pci dev assignment with a Guest
>>>>>>>> guest
>>>>>>>>> SMMUv3, the physical SMMUv3 has to be configured in nested(S1+s2)
>>>>>>>> nested (s1+s2)
>>>>>>>>> mode, with Guest owning the S1 page tables. Subsequent patches
>>>>>>>>> will
>>>>>>>> the guest
>>>>>>>>> add support for smmuv3-accel to provide this.
>>>>>>>> Can't this -accel smmu also work with emulated devices? Do we
>>>>>>>> want an
>>>>>>>> exclusive usage?
>>>>>>> Is there any benefit from emulated devices working in the HW-
>>>>>>> accelerated nested translation mode?
>>>>>> Not really, but do we have any justification for using a
>>>>>> different device name in accel mode? I am not even sure the
>>>>>> accel option is really needed. Ideally the qemu device should be
>>>>>> able to detect it is protecting a VFIO device, in which case it
>>>>>> shall check whether nested is supported by the host SMMU and
>>>>>> then automatically turn on accel mode?
>>>>>>
>>>>>> I gave the example of the vfio device, which has different class
>>>>>> implementations depending on the iommufd option being set or not.
>>>>> Do you mean that we should just create a regular smmuv3 device and
>>>>> let a VFIO device turn on this smmuv3's accel mode depending on
>>>>> its LEGACY/IOMMUFD class?
no, this is not what I meant. I gave an example where, depending on
>>>> an option passed to the VFIO device, you choose one class
>>>> implementation or the other.
>>> Option means something like this:
>>>     -device smmuv3,accel=on
>>> instead of
>>>     -device "smmuv3-accel"
>>> ?
>>>
>>> Yea, I think that's good.
>> Yeah, actually that's a big debate for not much. From an
>> implementation pov that shall not change much. The only doubt I have
>> is that if we need to conditionally expose the MSI RESV regions, it
>> is easier to do if we detect we have an smmuv3-accel. What the option
>> allows is the auto mode.
>>>
>>>>> Another question: how does an emulated device work with a vSMMUv3?
>>>> I don't get your question. vSMMUv3 currently only works with emulated
>>>> devices. Did you mean accelerated SMMUv3?
>>> Yea. If "accel=on", how does an emulated device work with that?
>>>
>>>>> I could imagine that all the accel steps would be bypassed since
>>>>> !sdev->idev. Yet, the emulated iotlb should cache its translations,
>>>>> so we will need to flush the iotlb, which will increase complexity
>>>>> as the TLBI command dispatching function will need to be aware of
>>>>> which ASID is for an emulated device and which is for a vfio
>>>>> device...
I don't get the issue. For an emulated device you go through the
>>>> usual translate path, which indeed caches configs and translations.
>>>> In case the guest invalidates something, you know the SID and you
>>>> find the entries in the cache that are tagged by this SID.
>>>>
In case you have an accelerated device (indeed if sdev->idev) you
>>>> don't exercise that path. On invalidation you detect that the SID
>>>> matches a VFIO device and propagate the invalidations to the host
>>>> instead. On the invalidation you should be able to detect pretty
>>>> easily whether you need to flush the emulated caches or propagate
>>>> the invalidations. Am I missing some extra problem?
>>>>
I do not say we should support emulated devices and VFIO devices
>>>> in the same guest iommu group. But I don't see why we couldn't
>>>> easily plug the accelerated logic into the current logic for
>>>> emulation/vhost without requiring a different qemu device.
>>> Hmm, feels like I fundamentally misunderstood your point.
>>>   a) We implement the device model with the same piece of code but
>>>      only provide an option "accel=on/off" to switch mode. And both
>>>      passthrough devices and emulated devices can attach to the same
>>>      "accel=on" device.
>> I think we all agree we don't want that use case in general. However,
>> effectively I was questioning why it couldn't work, maybe at the
>> expense of some perf degradation.
>>>   b) We implement the device model with the same piece of code but
>>>      only provide an option "accel=on/off" to switch mode. Then, a
>>>      passthrough device can attach to an "accel=on" device, but an
>>>      emulated device can only attach to an "accel=off" SMMU device.
>>>
>>> I was thinking that you want case (a). But actually you were just
>>> talking about case (b)? I think (b) is totally fine.
>>>
>>> We certainly can't do case (a): not all TLBI commands give an "SID"
>>> field (so we would have to broadcast, i.e. the underlying SMMU HW
>>> would run commands that were intended for emulated devices only);
>>> in case of vCMDQ, commands for emulated devices would be issued to
>>> real HW and
>> I am still confused about that. For instance, if the guest sends an
>> NH_ASID or NH_VA invalidation and it happens that both the emulated
>> device and the VFIO device share the same cd.asid (same guest iommu
>> domain, which practically should not happen), why shouldn't we
>> propagate the
it can't ... on ARM ... PCIe only, no shared iommu domain between
> devices.
Yeah, I agree, this generally happens behind a PCIe-to-PCI bridge.
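To make the routing question concrete, here is a minimal, purely
illustrative C sketch of the dispatch problem: all names, types, and the
SID classification below are invented for illustration (they are not
QEMU's internals; think of classify_sid() as standing in for a check
like whether sdev->idev is set). It shows why a SID-tagged command can
be routed precisely while an ASID-only command forces a broadcast to
both the emulated IOTLB and the host:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of a guest invalidation command:
 * some commands carry a StreamID (SID); TLBI_NH_ASID / TLBI_NH_VA
 * carry only an ASID (and VA). */
typedef struct {
    bool has_sid;
    int sid;
    int asid;
} inval_cmd;

typedef enum { DEV_EMULATED, DEV_VFIO } dev_class;

/* Toy SID -> device-class lookup; a real implementation would consult
 * per-device state instead of an arbitrary SID range. */
static dev_class classify_sid(int sid)
{
    return (sid >= 0x100) ? DEV_VFIO : DEV_EMULATED;
}

typedef enum { FLUSH_EMULATED, PROPAGATE_HOST, BROADCAST_BOTH } inval_target;

/* Decide where the invalidation must go. */
static inval_target dispatch(const inval_cmd *cmd)
{
    if (cmd->has_sid) {
        /* SID-tagged: route precisely to exactly one side. */
        return classify_sid(cmd->sid) == DEV_VFIO ? PROPAGATE_HOST
                                                  : FLUSH_EMULATED;
    }
    /* ASID-scoped command with no SID: without tracking which ASIDs
     * belong to which device class, it must hit both the emulated
     * IOTLB and the host -- the broadcast cost raised in the thread. */
    return BROADCAST_BOTH;
}
```

The sketch also suggests where per-ASID ownership tracking would pay
off: if the SMMU model knew an ASID belongs only to emulated devices,
the BROADCAST_BOTH case could be narrowed.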
>
Isn't this another reason (perf) why emulated devices & physical
> devices should be on different vSMMUs ... so it can be distinguished
> how deep (to hw) or how wide (a broadcast) actions like TLBI are
> implemented, or how they impact other devices?
To me the actual issue is vcmdq. Here we have a blocker. Otherwise, if
you don't have vcmdq, you can still propagate invalidations using the
proper notifier (VFIO or vhost). This used to work.
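For reference, the two command-line shapes compared earlier in the
thread would look roughly as below. This is a sketch of the interface
under discussion, not a merged QEMU interface; the vfio-pci host
address and iommufd object name are illustrative placeholders:

```shell
# (1) dedicated device type, as in the patch under review:
qemu-system-aarch64 -M virt \
    -object iommufd,id=iommufd0 \
    -device smmuv3-accel \
    -device vfio-pci,host=0000:01:00.0,iommufd=iommufd0

# (2) single device type with an option, as suggested in the review:
qemu-system-aarch64 -M virt \
    -object iommufd,id=iommufd0 \
    -device smmuv3,accel=on \
    -device vfio-pci,host=0000:01:00.0,iommufd=iommufd0
```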

Eric
>
>
>> invalidation to the host. Does the problem come from the usage of vCMDQ
>> or would you foresee the same problem with a generic physical SMMU?
>>
>> Thanks
>>
>> Eric
>>> trigger HW errors.
>>>
>>> Thanks
>>> Nicolin
>>>
>>
>