Hi,

On 3/11/25 3:10 PM, Shameer Kolothum wrote:
> Hi All,
>
> This patch series introduces initial support for a user-creatable
> accelerated SMMUv3 device (-device arm-smmuv3-accel) in QEMU.
>
> Why this is needed:
>
> Currently, QEMU’s ARM SMMUv3 emulation (iommu=smmuv3) is tied to the
> machine and does not support configuring the host SMMUv3 in nested
> mode. This limitation prevents its use with vfio-pci passthrough
> devices.
>
> The new pluggable smmuv3-accel device enables host SMMUv3 configuration
> with nested stage support (Stage 1 owned by the Guest and Stage 2 by the
> host) via the new IOMMUFD APIs. Additionally, it allows multiple 
> accelerated vSMMUv3 instances for guests running on hosts with multiple
> physical SMMUv3s.
>
> This brings the following benefits:
> -Reduces invalidation broadcasts and lookups for devices behind multiple
>  physical SMMUv3s.
> -Simplifies handling of host SMMUv3s with differing feature sets.
> -Lays the groundwork for additional capabilities like vCMDQ support.
>
>
> Changes from RFCv1[0]:
>
> Thanks to everyone who provided feedback on RFCv1!
>
> -The device is now called arm-smmuv3-accel instead of arm-smmuv3-nested
>  to better reflect its role in using the host's physical SMMUv3 for page
>  table setup and cache invalidations.
> -Includes patches for VIOMMU and VDEVICE IOMMUFD APIs (patches 1,2).
> -Merges patches from Nicolin’s GitHub repository that add accelerated
>  functionality for page table setup and cache invalidations[1]. I have
>  modified these a bit, but hopefully have not broken anything.
> -Incorporates various fixes and improvements based on RFCv1 feedback.
> -Adds support for vfio-pci hotplug with smmuv3-accel.
>
> Note: IORT RMR patches for MSI setup are currently excluded as we may
> adopt a different approach for MSI handling in the future [2].
>
> This series also depends on the common iommufd/vfio patches from
> Zhenzhong's series[3].
>
> ToDos:
>
> -At least one vfio-pci device must currently be cold-plugged to a
>  pxb-pcie bus associated with the arm-smmuv3-accel. This is required both
>  to associate a vSMMUv3 with a host SMMUv3 and to retrieve the host
>  SMMUv3 IDR registers for guest export.
>  Future updates will remove this restriction by adding the
>  necessary kernel support.
>  See the discussion here[4].
> -This version does not yet support host SMMUv3 fault handling or
>  other event notifications. These will be addressed in a
>  future patch series.
>
>
> The complete branch can be found here:
> https://github.com/hisilicon/qemu/tree/master-smmuv3-accel-rfcv2-ext
>
> I have done basic sanity testing on a HiSilicon platform using the kernel
> branch here:
> https://github.com/nicolinc/iommufd/tree/iommufd_msi-rfcv2
>
> Usage example:
>
> On a HiSilicon platform that has multiple host SMMUv3s, the ACC ZIP VF
> devices and HNS VF devices are behind different host SMMUv3s. So for a
> Guest, specify two arm-smmuv3-accel devices each behind a pxb-pcie as below,
>
>
> ./qemu-system-aarch64 -machine virt,accel=kvm,gic-version=3 \
> -cpu host -smp cpus=4 -m size=4G,slots=4,maxmem=256G \
> -bios QEMU_EFI.fd \
> -object iommufd,id=iommufd0 \
> -device virtio-blk-device,drive=fs \
> -drive if=none,file=rootfs.qcow2,id=fs \
> -device pxb-pcie,id=pcie.1,bus_nr=1,bus=pcie.0 \
> -device arm-smmuv3-accel,bus=pcie.1 \
> -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1,pref64-reserve=2M,io-reserve=1K \
> -device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \
> -device pcie-root-port,id=pcie.port2,bus=pcie.1,chassis=2,pref64-reserve=2M,io-reserve=1K \
> -device vfio-pci,host=0000:7d:02.2,bus=pcie.port2,iommufd=iommufd0 \
> -device pxb-pcie,id=pcie.2,bus_nr=8,bus=pcie.0 \
> -device arm-smmuv3-accel,bus=pcie.2 \
> -device pcie-root-port,id=pcie.port3,bus=pcie.2,chassis=3,pref64-reserve=2M,io-reserve=1K \
> -device vfio-pci,host=0000:75:00.1,bus=pcie.port3,iommufd=iommufd0 \
> -kernel Image \
> -append "rdinit=init console=ttyAMA0 root=/dev/vda2 rw earlycon=pl011,0x9000000" \
> -device virtio-9p-pci,fsdev=p9fs,mount_tag=p9,bus=pcie.0 \
> -fsdev local,id=p9fs,path=p9root,security_model=mapped \
> -net none \
> -nographic
>
> The guest will boot with two SMMUv3s:
> ...
> arm-smmu-v3 arm-smmu-v3.0.auto: option mask 0x0
> arm-smmu-v3 arm-smmu-v3.0.auto: ias 44-bit, oas 44-bit (features 0x00008325)
> arm-smmu-v3 arm-smmu-v3.0.auto: allocated 65536 entries for cmdq
> arm-smmu-v3 arm-smmu-v3.0.auto: allocated 32768 entries for evtq
> arm-smmu-v3 arm-smmu-v3.1.auto: option mask 0x0
> arm-smmu-v3 arm-smmu-v3.1.auto: ias 44-bit, oas 44-bit (features 0x00008325)
> arm-smmu-v3 arm-smmu-v3.1.auto: allocated 65536 entries for cmdq
> arm-smmu-v3 arm-smmu-v3.1.auto: allocated 32768 entries for evtq
>
> with a PCI topology like the one below:
>
> [root@localhost ~]# lspci -tv
> -+-[0000:00]-+-00.0  Red Hat, Inc. QEMU PCIe Host bridge
>  |           +-01.0  Red Hat, Inc. QEMU PCIe Expander bridge
>  |           +-02.0  Red Hat, Inc. QEMU PCIe Expander bridge
>  |           \-03.0  Virtio: Virtio filesystem
>  +-[0000:01]-+-00.0-[02]----00.0  Huawei Technologies Co., Ltd. HNS Network Controller (Virtual Function)
>  |           \-01.0-[03]----00.0  Huawei Technologies Co., Ltd. HNS Network Controller (Virtual Function)
>  \-[0000:08]---00.0-[09]----00.0  Huawei Technologies Co., Ltd. HiSilicon ZIP Engine(Virtual Function)

For the record, I tested the series with a host VFIO device and a
virtio-blk-pci device placed behind the same pxb-pcie/SMMU, and
it works just fine:

-+-[0000:0a]-+-01.0-[0b]----00.0  Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
 |           \-01.1-[0c]----00.0  Red Hat, Inc. Virtio 1.0 block device
 \-[0000:00]-+-00.0  Red Hat, Inc. QEMU PCIe Host bridge
             +-01.0-[01]--
             +-01.1-[02]--
             \-02.0  Red Hat, Inc. QEMU PCIe Expander bridge

This shows that, without the vCMDQ feature, nothing prevents the
same SMMU device from protecting both accelerated and emulated devices.

Thanks

Eric
>
> Further tests are always welcome.
>
> Please take a look and let me know your feedback!
>
> Thanks,
> Shameer
>
> [0] https://lore.kernel.org/qemu-devel/20241108125242.60136-1-shameerali.kolothum.th...@huawei.com/
> [1] https://github.com/nicolinc/qemu/commit/3acbb7f3d114d6bb70f4895aa66a9ec28e6561d6
> [2] https://lore.kernel.org/linux-iommu/cover.1740014950.git.nicol...@nvidia.com/
> [3] https://lore.kernel.org/qemu-devel/20250219082228.3303163-1-zhenzhong.d...@intel.com/
> [4] https://lore.kernel.org/qemu-devel/z6tlsdwgajmhv...@redhat.com/
>
> Nicolin Chen (11):
>   backends/iommufd: Introduce iommufd_backend_alloc_viommu
>   backends/iommufd: Introduce iommufd_vdev_alloc
>   hw/arm/smmuv3-accel: Add set/unset_iommu_device callback
>   hw/arm/smmuv3-accel: Support nested STE install/uninstall support
>   hw/arm/smmuv3-accel: Allocate a vDEVICE object for device
>   hw/arm/smmuv3-accel: Return sysmem if stage-1 is bypassed
>   hw/arm/smmuv3-accel: Introduce helpers to batch and issue cache
>     invalidations
>   hw/arm/smmuv3: Forward invalidation commands to hw
>   hw/arm/smmuv3-accel: Read host SMMUv3 device info
>   hw/arm/smmuv3: Check idr registers for STE_S1CDMAX and STE_S1STALLD
>   hw/arm/smmu-common: Bypass emulated IOTLB for a accel SMMUv3
>
> Shameer Kolothum (9):
>   hw/arm/smmuv3-accel: Add initial infrastructure for smmuv3-accel
>     device
>   hw/arm/virt: Add support for smmuv3-accel
>   hw/arm/smmuv3-accel: Associate a pxb-pcie bus
>   hw/arm/smmu-common: Factor out common helper functions and export
>   hw/arm/smmu-common: Introduce callbacks for PCIIOMMUOps
>   hw/arm/smmuv3-accel: Provide get_address_space callback
>   hw/arm/smmuv3: Install nested ste for CFGI_STE
>   hw/arm/virt-acpi-build: Update IORT with multiple smmuv3-accel nodes
>   hw/arm/smmuv3-accel: Enable smmuv3-accel creation
>
>  backends/iommufd.c            |  51 +++
>  backends/trace-events         |   2 +
>  hw/arm/Kconfig                |   5 +
>  hw/arm/meson.build            |   1 +
>  hw/arm/smmu-common.c          |  95 +++++-
>  hw/arm/smmuv3-accel.c         | 616 ++++++++++++++++++++++++++++++++++
>  hw/arm/smmuv3-internal.h      |  54 +++
>  hw/arm/smmuv3.c               |  80 ++++-
>  hw/arm/trace-events           |   6 +
>  hw/arm/virt-acpi-build.c      | 113 ++++++-
>  hw/arm/virt.c                 |  12 +
>  hw/core/sysbus-fdt.c          |   1 +
>  include/hw/arm/smmu-common.h  |  14 +
>  include/hw/arm/smmuv3-accel.h |  75 +++++
>  include/hw/arm/virt.h         |   1 +
>  include/system/iommufd.h      |  14 +
>  16 files changed, 1101 insertions(+), 39 deletions(-)
>  create mode 100644 hw/arm/smmuv3-accel.c
>  create mode 100644 include/hw/arm/smmuv3-accel.h
>

