On 11/13/24 1:05 PM, Nicolin Chen wrote:
Hi Eric,
On Wed, Nov 13, 2024 at 06:12:15PM +0100, Eric Auger wrote:
On 11/8/24 13:52, Shameer Kolothum wrote:
@@ -181,6 +181,7 @@ static const MemMapEntry base_memmap[] = {
[VIRT_PVTIME] = { 0x090a0000, 0x00010000 },
[VIRT_SECURE_GPIO] = { 0x090b0000, 0x00001000 },
[VIRT_MMIO] = { 0x0a000000, 0x00000200 },
+ [VIRT_SMMU_NESTED] = { 0x0b000000, 0x01000000 },
I agree with Mostafa that the _NESTED terminology may not be the best
The motivation behind that multi-instance attempt, as introduced in
- SMMUs with different feature bits
- support of VCMDQ HW extension for SMMU CMDQ
- need for separate S1 invalidation paths
If I understand correctly, this is mostly wanted for VCMDQ handling? If
this is correct, we may indicate that somehow in the terminology.
If I understand correctly, VCMDQ terminology is NVIDIA-specific while
ECMDQ is the baseline (?).
VCMDQ makes a multi-vSMMU-instance design a hard requirement, yet
the point (3) for separate invalidation paths also matters. Jason
suggested that the VMM, in the base case, create multiple vSMMU instances, as the
kernel doc mentioned here:
W.r.t naming, maybe something related to "hardware-accelerated"?
Given that 'accel' has been used for hw-acceleration elsewhere, that seems like
a reasonable 'mode'.
But it needs a parameter to state what is being accelerated.
i.e., the more global 'accel=kvm' has 'kvm'.
For SMMUv3, NVIDIA-specific vCMDQ, it needs a parameter to state that
since I'm concluding from reading the SMMUv3 version G.a spec, that ECMDQ was
to be able to assign an ECMDQ to a VM, and let the VM do CMDQ-driven
invalidations via a similar mechanism as assigned PCI-device MMIO space in a VM.
So, how should the QEMU invocation select what parts to 'accel' in the vSMMUv3
to the VM? ... and given the history of hw-based, virt-acceleration, I can
only guess more SMMUv3 accel tweaks will be found/desired/implemented.
So, given there is an NVIDIA-specific/like ECMDQ, but different, the accel
chosen has to consider 'name-space collision', i.e., accel=nv-vcmdq and
unless sw can be made to smartly probe and determine the underlying diffs, and
equivalent functionality, in which case, a simpler 'accel=vcmdq' could be used.
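To make the naming question concrete, here is a minimal sketch of how a
vendor-prefixed 'accel' value could be mapped to a mode. Everything in it is
hypothetical: the enum, the function name, and the "none" value are made up
for illustration and are not existing QEMU code or options. A bare "vcmdq" is
rejected here on purpose, since the sketch assumes no probing that could tell
the variants apart.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical accel modes; none of these identifiers exist in QEMU. */
typedef enum {
    SMMU_ACCEL_INVALID = -1,  /* unknown or ambiguous value */
    SMMU_ACCEL_NONE = 0,      /* assumed default: no hw acceleration */
    SMMU_ACCEL_ECMDQ,         /* ECMDQ as described in the SMMUv3 spec */
    SMMU_ACCEL_NV_VCMDQ,      /* NVIDIA-specific vCMDQ */
} SmmuAccelMode;

/* Map the string given after "accel=" to a mode. */
static SmmuAccelMode smmu_accel_parse(const char *value)
{
    assert(value != NULL);

    if (strcmp(value, "none") == 0) {
        return SMMU_ACCEL_NONE;
    }
    if (strcmp(value, "ecmdq") == 0) {
        return SMMU_ACCEL_ECMDQ;
    }
    if (strcmp(value, "nv-vcmdq") == 0) {
        return SMMU_ACCEL_NV_VCMDQ;
    }
    /*
     * A plain "vcmdq" lands here: without probing, it is ambiguous
     * which queue flavour is meant, so the sketch refuses it.
     */
    return SMMU_ACCEL_INVALID;
}
```

With distinct names, smmu_accel_parse("nv-vcmdq") and smmu_accel_parse("ecmdq")
can never collide; accepting a generic "vcmdq" would only make sense once sw
can probe the differences.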
Finally, wrt libvirt, how does it know/tell what can and should be used?
For ECMDQ, something under sysfs for an SMMUv3 could expose its
(tag for use/alloc'd for a VM), or an ioctl/cdev i/f to the SMMUv3.
But how does one know today that there's NVIDIA-vCMDQ support on its SMMUv3? --
is it exposed in sysfs, ioctl, cdev?
... and all needs to be per-instance ....
... libvirt (or any other VMM orchestrator) will need to determine
compatibility for live migration. e.g., can one live migrate an
accel=nv-vcmdq-based VM to a host with accel=ecmdq support? Only nv-vcmdq?
What if there are version diffs of nv-vcmdq over time?
-- apologies, but I don't know the minute details of nv-vcmdq to determine
if that's unlikely or not.
Once the qemu-smmuv3-api is defined, with the recognition of what libvirt (or
any other VMM) needs to probe/check/use for hw-accelerated features,
I think it'll be more straightforward to implement, and (clearly) understand from
a qemu command line. :)
- Don