On 11/27/24 5:21 AM, Shameerali Kolothum Thodi wrote:


-----Original Message-----
From: Donald Dutile <ddut...@redhat.com>
Sent: Tuesday, November 26, 2024 6:29 PM
To: Nicolin Chen <nicol...@nvidia.com>; Eric Auger
<eric.au...@redhat.com>
Cc: Shameerali Kolothum Thodi
<shameerali.kolothum.th...@huawei.com>; qemu-...@nongnu.org;
qemu-devel@nongnu.org; peter.mayd...@linaro.org; j...@nvidia.com;
Linuxarm <linux...@huawei.com>; Wangzhou (B)
<wangzh...@hisilicon.com>; jiangkunkun <jiangkun...@huawei.com>;
Jonathan Cameron <jonathan.came...@huawei.com>;
zhangfei....@linaro.org
Subject: Re: [RFC PATCH 2/5] hw/arm/smmuv3: Add initial support for
SMMUv3 Nested device



On 11/13/24 1:05 PM, Nicolin Chen wrote:
Hi Eric,

On Wed, Nov 13, 2024 at 06:12:15PM +0100, Eric Auger wrote:
On 11/8/24 13:52, Shameer Kolothum wrote:
@@ -181,6 +181,7 @@ static const MemMapEntry base_memmap[] = {
       [VIRT_PVTIME] =             { 0x090a0000, 0x00010000 },
       [VIRT_SECURE_GPIO] =        { 0x090b0000, 0x00001000 },
       [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
+    [VIRT_SMMU_NESTED] =        { 0x0b000000, 0x01000000 },

I agree with Mostafa that the _NESTED terminology may not be the best
choice.
The motivation behind that multi-instance attempt, as introduced in
https://lore.kernel.org/all/ZEcT%2F7erkhHDaNvD@Asurada-Nvidia/
was:
- SMMUs with different feature bits
- support of VCMDQ HW extension for SMMU CMDQ
- need for separate S1 invalidation paths

If I understand correctly this is mostly wanted for VCMDQ handling? If
that is correct, we may want to indicate that somehow in the terminology.

If I understand correctly, the VCMDQ terminology is NVIDIA-specific, while
ECMDQ is the baseline (?).

VCMDQ makes a multi-vSMMU-instance design a hard requirement, yet
point (3), the need for separate S1 invalidation paths, also matters. Jason
suggested that the VMM create multiple vSMMU instances in the base case, as
the kernel doc mentions here:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/userspace-api/iommufd.rst#n84

W.r.t naming, maybe something related to "hardware-accelerated"?

Given that 'accel' has been used for hw-acceleration elsewhere, that seems
like a reasonable 'mode'.
But it needs a parameter to state what is being accelerated,
i.e., the more global 'accel=kvm' has 'kvm'.

I was thinking more along the lines of calling this hw-accelerated nested
SMMUv3 emulation 'smmuv3-accel'.  This avoids confusion with the already
existing 'iommu=smmuv3' option, which also has nested emulation support.

i.e.,
-device arm-smmuv3-accel,id=smmuv1,bus=pcie.1 \

I -think- you are saying below that we have to think a bit more about this
device tagging.  I'm thinking more like:
-device arm-smmuv3,accel=<vcmdq>,id=smmu1,bus=pcie.1 \
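
Just to illustrate the shape of such an option (hypothetical, not existing
QEMU code): a trivial sketch of how a VMM might map an accel= value onto an
internal mode; all names below are made up for the example.

#include <stdio.h>
#include <string.h>

typedef enum {
    SMMU_ACCEL_NONE,    /* plain emulated SMMUv3 */
    SMMU_ACCEL_VCMDQ,   /* NVIDIA vCMDQ-style assignable command queue */
    SMMU_ACCEL_ECMDQ,   /* baseline ARM ECMDQ, if it ever becomes assignable */
} SMMUAccelMode;

static SMMUAccelMode smmu_accel_parse(const char *opt)
{
    if (!opt || !strcmp(opt, "none")) {
        return SMMU_ACCEL_NONE;
    }
    if (!strcmp(opt, "vcmdq") || !strcmp(opt, "nv-vcmdq")) {
        return SMMU_ACCEL_VCMDQ;
    }
    if (!strcmp(opt, "ecmdq")) {
        return SMMU_ACCEL_ECMDQ;
    }
    fprintf(stderr, "unknown accel mode '%s', falling back to emulation\n", opt);
    return SMMU_ACCEL_NONE;
}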


For SMMUv3 with the NVIDIA-specific vCMDQ, it needs a parameter to state that
specifically, since I'm concluding from reading the SMMUv3 version G.a spec
that ECMDQ was added to be able to assign an ECMDQ to a VM,

I'm not sure the intention of ECMDQ, as per that specification, is to assign
it to a VM. I think the main idea behind it is to have one command queue
per host CPU to eliminate lock contention while submitting commands
to the SMMU.

AFAIK it is not safe to assign one of the ECMDQs to a guest yet. I think
there is no way you can associate a VMID with an ECMDQ. So there is no plan
to support ARM ECMDQ for now.

NVIDIA VCMDQ is a completely vendor-specific one. Perhaps ARM may come
up with an assignable CMDQ in the future though.

  and let the VM do CMDQ-driven invalidations via a similar mechanism as
assigned PCI-device MMIO space in a VM.
So, how should the QEMU invocation select which parts to 'accel' in the
vSMMUv3 given to the VM?  ... and given the history of hw-based
virt-acceleration, I can only guess more SMMUv3 accel tweaks will be
found/desired/implemented.

So, given there is an NVIDIA-specific mechanism that is ECMDQ-like but
different, the accel parameter chosen has to consider 'name-space collision',
i.e., accel=nv-vcmdq and accel=ecmdq, unless sw can be made to smartly probe
and determine the underlying diffs and offer equivalent functionality, in
which case a simpler 'accel=vcmdq' could be used.


Yep. Probably we could abstract that from the user and handle it within
QEMU when the kernel reports the capability based on the physical SMMUv3.
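
Purely as a sketch of that idea (not actual QEMU code): the kernel's iommufd
IOMMU_GET_HW_INFO ioctl already returns a struct iommu_hw_info_arm_smmuv3
snapshot of the physical SMMU, so such a probe could look roughly like the
below. The HAS_VCMDQ flag is a made-up placeholder; no such capability bit
is defined in the uAPI yet.

#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/iommufd.h>

/* Placeholder only, not a real uAPI flag. */
#define HYPOTHETICAL_HW_INFO_HAS_VCMDQ (1U << 0)

static bool smmu_vcmdq_accel_usable(int iommufd, uint32_t dev_id)
{
    struct iommu_hw_info_arm_smmuv3 smmu = {};
    struct iommu_hw_info cmd = {
        .size = sizeof(cmd),
        .dev_id = dev_id,                 /* iommufd device handle */
        .data_len = sizeof(smmu),
        .data_uptr = (uintptr_t)&smmu,
    };

    if (ioctl(iommufd, IOMMU_GET_HW_INFO, &cmd)) {
        return false;                     /* no hw info: plain emulation only */
    }
    if (cmd.out_data_type != IOMMU_HW_INFO_TYPE_ARM_SMMUV3) {
        return false;                     /* not backed by a physical SMMUv3 */
    }
    /* smmu.idr[]/iidr/aidr mirror the physical ID registers and could be
     * used for finer-grained feature checks as well. */
    return smmu.flags & HYPOTHETICAL_HW_INFO_HAS_VCMDQ;
}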

Finally, wrt libvirt, how does it know/tell what can and should be used?
For ECMDQ, something under sysfs for an SMMUv3 could expose its
presence/capability/availability (tag for use/alloc'd for a VM), or an
ioctl/cdev i/f to the SMMUv3.
But how does one know today that there's NVIDIA-vCMDQ support on its
SMMUv3? -- is it exposed in sysfs, ioctl, cdev?

I think the capability will be reported through an IOCTL.  Nicolin?

... and all needs to be per-instance ....
... libvirt (or any other VMM orchestrator) will need to determine
    compatibility for live migration.  e.g., can one live migrate an
    accel=nv-vcmdq-based VM to a host with accel=ecmdq support?  only
    nv-vcmdq?  what if there are version diffs of nv-vcmdq over time?
    -- apologies, but I don't know the minute details of nv-vcmdq to
    determine if that's unlikely or not.

Yes. This requires more thought. But our first aim is to get the basic
smmuv3-accel support in.

Thanks,
Shameer


