> -----Original Message-----
> From: Donald Dutile <ddut...@redhat.com>
> Sent: Tuesday, November 26, 2024 6:29 PM
> To: Nicolin Chen <nicol...@nvidia.com>; Eric Auger
> <eric.au...@redhat.com>
> Cc: Shameerali Kolothum Thodi
> <shameerali.kolothum.th...@huawei.com>; qemu-...@nongnu.org;
> qemu-devel@nongnu.org; peter.mayd...@linaro.org; j...@nvidia.com;
> Linuxarm <linux...@huawei.com>; Wangzhou (B)
> <wangzh...@hisilicon.com>; jiangkunkun <jiangkun...@huawei.com>;
> Jonathan Cameron <jonathan.came...@huawei.com>;
> zhangfei....@linaro.org
> Subject: Re: [RFC PATCH 2/5] hw/arm/smmuv3: Add initial support for
> SMMUv3 Nested device
> 
> 
> 
> On 11/13/24 1:05 PM, Nicolin Chen wrote:
> > Hi Eric,
> >
> > On Wed, Nov 13, 2024 at 06:12:15PM +0100, Eric Auger wrote:
> >> On 11/8/24 13:52, Shameer Kolothum wrote:
> >>> @@ -181,6 +181,7 @@ static const MemMapEntry base_memmap[] = {
> >>>       [VIRT_PVTIME] =             { 0x090a0000, 0x00010000 },
> >>>       [VIRT_SECURE_GPIO] =        { 0x090b0000, 0x00001000 },
> >>>       [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
> >>> +    [VIRT_SMMU_NESTED] =        { 0x0b000000, 0x01000000 },
> >
> >> I agree with Mostafa that the _NESTED terminology may not be the best
> >> choice.
> >> The motivation behind that multi-instance attempt, as introduced in
> >> https://lore.kernel.org/all/ZEcT%2F7erkhHDaNvD@Asurada-Nvidia/
> >> was:
> >> - SMMUs with different feature bits
> >> - support of VCMDQ HW extension for SMMU CMDQ
> >> - need for separate S1 invalidation paths
> >>
> >> If I understand correctly, this is mostly wanted for VCMDQ handling? If
> >> this is correct, we may indicate that somehow in the terminology.
> >>
> >> If I understand correctly VCMDQ terminology is NVidia specific while
> >> ECMDQ is the baseline (?).
> >
> > VCMDQ makes a multi-vSMMU-instance design a hard requirement, yet
> > the point (3) for separate invalidation paths also matters. Jason
> > suggested that the VMM, in the base case, create multiple vSMMU
> > instances, as the kernel doc mentions here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/userspace-api/iommufd.rst#n84
> >
> > W.r.t naming, maybe something related to "hardware-accelerated"?
> >
> Given that 'accel' has been used for hw-acceleration elsewhere, that seems
> like a reasonable 'mode'.
> But, it needs a parameter to state what is being accelerated.
> i.e., the more global 'accel=kvm' has 'kvm'.

I was thinking more along the lines of calling this hw-accelerated nested
SMMUv3 emulation 'smmuv3-accel'. This avoids confusion with the already
existing 'iommu=smmuv3', which also has nested emulation support.

i.e.,
-device arm-smmuv3-accel,id=smmuv1,bus=pcie.1 \

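For the multi-instance case discussed above, a full invocation might look
roughly like the sketch below, with one instance per pxb-pcie root complex.
The bus numbers, ids and host BDFs here are made-up placeholders, and the
final syntax/property names may well change:

-device pxb-pcie,id=pcie.1,bus_nr=2,bus=pcie.0 \
-device arm-smmuv3-accel,id=smmuv1,bus=pcie.1 \
-device pcie-root-port,id=pcie.port1,chassis=1,bus=pcie.1 \
-device vfio-pci,host=0000:7d:00.1,bus=pcie.port1 \
-device pxb-pcie,id=pcie.2,bus_nr=8,bus=pcie.0 \
-device arm-smmuv3-accel,id=smmuv2,bus=pcie.2 \
-device pcie-root-port,id=pcie.port2,chassis=2,bus=pcie.2 \
-device vfio-pci,host=0000:75:00.1,bus=pcie.port2 \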
> 
> For SMMUv3, NVIDIA-specific vCMDQ, it needs a parameter to state that
> specifically, since I'm concluding from reading the SMMUv3 version G.a
> spec, that ECMDQ was added to be able to assign an ECMDQ to a VM,

I am not sure the intention of ECMDQ, as per that specification, is to assign
it to a VM. I think the main idea behind it is to have one Command Queue
per host CPU to eliminate lock contention while submitting commands
to the SMMU.

AFAIK it is not safe to assign one of the ECMDQs to a guest yet. I think
there is no way you can associate a VMID with an ECMDQ. So there is no plan
to support ARM ECMDQ for now.

NVIDIA VCMDQ is a completely vendor-specific implementation. Perhaps ARM may
come up with an assignable CMDQ in the future though.

> and let the VM do CMDQ driven invalidations via a similar mechanism as
> assigned PCI-device mmio space in a VM.
> So, how should the QEMU invocation select what parts to 'accel' in the
> vSMMUv3 given to the VM?  ... and given the history of hw-based
> virt-acceleration, I can only guess more SMMUv3 accel tweaks will be
> found/desired/implemented.
> 
> So, given there is an NVIDIA-specific/like ECMDQ, but different, the accel
> parameter chosen has to consider 'name-space collision', i.e.,
> accel=nv-vcmdq and accel=ecmdq, unless sw can be made to smartly probe and
> determine the underlying diffs, and have equivalent functionality, in which
> case, a simpler 'accel=vcmdq' could be used.
> 

Yep. Probably we could abstract that from the user and handle it within
QEMU, based on the capability the kernel reports for the physical SMMUv3.

> Finally, wrt libvirt, how does it know/tell what can and should be used?
> For ECMDQ, something under sysfs for an SMMUv3 could expose its
> presence/capability/availability (tag for use/alloc'd for a VM), or an
> ioctl/cdev i/f to the SMMUv3.
> But how does one know today that there's NVIDIA-vCMDQ support on its
> SMMUv3? -- is it exposed in sysfs, ioctl, cdev?

I think the capability will be reported through an IOCTL. Nicolin?
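
For what it's worth, the existing iommufd IOMMU_GET_HW_INFO ioctl already
returns the physical SMMUv3 ID registers. Below is a rough userspace sketch
(not code from this series) of that query, assuming a kernel that implements
IOMMU_HW_INFO_TYPE_ARM_SMMUV3. How a VCMDQ capability would actually surface
on top of this (a flag in that structure, or something else entirely) is
exactly the open question, so the decoding shown is illustrative only:

#include <stdio.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/iommufd.h>

/*
 * Query the physical SMMUv3 ID registers for one device through iommufd.
 * 'iommufd' is an open /dev/iommu fd and 'dev_id' is the device id the
 * kernel handed back when the VFIO device was bound to it
 * (VFIO_DEVICE_BIND_IOMMUFD).
 */
static int query_smmu_hw_info(int iommufd, uint32_t dev_id)
{
    struct iommu_hw_info_arm_smmuv3 smmu = {};
    struct iommu_hw_info cmd = {
        .size = sizeof(cmd),
        .dev_id = dev_id,
        .data_len = sizeof(smmu),
        .data_uptr = (uint64_t)(uintptr_t)&smmu,
    };

    if (ioctl(iommufd, IOMMU_GET_HW_INFO, &cmd))
        return -1;

    if (cmd.out_data_type != IOMMU_HW_INFO_TYPE_ARM_SMMUV3)
        return -1;   /* the device is not behind an SMMUv3 */

    /* idr[]/iidr/aidr mirror the physical SMMU_IDRx/IIDR/AIDR registers */
    printf("SMMU_IDR0 0x%x IIDR 0x%x AIDR 0x%x\n",
           smmu.idr[0], smmu.iidr, smmu.aidr);
    return 0;
}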

> ... and all needs to be per-instance ....
> ... libvirt (or any other VMM orchestrator) will need to determine
>      compatibility for live migration. e.g., can one live migrate an
>      accel=nv-vcmdq-based VM to a host with accel=ecmdq support?  only
>      nv-vcmdq?  what if there are version diffs of nv-vcmdq over time?
>      -- apologies, but I don't know the minute details of nv-vcmdq to
>      determine if that's unlikely or not.

Yes. This requires more thought. But our first aim is to get the basic
smmuv3-accel support in place.

Thanks,
Shameer
