Hi Zhenzhong

On 8/30/23 12:37, Zhenzhong Duan wrote:
> Hi All,
>
> As the kernel-side iommufd cdev and hot reset features have been queued,
> and hwpt alloc has been added in Jason's for_next branch [1], I'd like
> to post a new version that matches the kernel-side updates, with the RFC
> flag removed. The QEMU code can be found at [2]; more comments are welcome!
>
>
> We have tested a wide range of combinations, e.g.:
>
> - PCI devices were tested
> - FD passing and hot reset, with some tricks
> - device hotplug with the legacy and iommufd backends
> - with or without vIOMMU for the legacy and iommufd backends
> - devices linked to different iommufds
> - VFIO migration with an E800 net card (no dirty sync support) passthrough
> - platform, ccw and ap were only compile-tested due to environment limits
>
>
> Given some iommufd kernel limitations, the iommufd backend is
> not yet fully on par with the legacy backend w.r.t. features like:
> - p2p mappings (you will see related error traces)
> - dirty page sync
> - etc.
>
>
> Changelog:
> v1:
> - Alloc hwpt instead of using auto hwpt
> - elaborate iommufd code per Nicolin
> - consolidate two patches and drop as.c
> - typo fixes and function renames
>
> I didn't list the changelog of the RFC stage; see [3] if anyone is interested.
>
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git
> [2] https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_cdev_v1
> [3] https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg02529.html

Do you have a branch to share?

The series does not apply to upstream.

Thanks

Eric
>
>
> --------------------------------------------------------------------------
>
> With the introduction of iommufd, the Linux kernel provides a generic
> interface for userspace drivers to propagate their DMA mappings to kernel
> for assigned devices. This series ports the VFIO devices onto the
> /dev/iommu uAPI and lets it coexist with the legacy implementation.
>
> This QEMU integration is the result of collaborative work between
> Yi Liu, Yi Sun, Nicolin Chen and Eric Auger.
>
> At QEMU level, interactions with the /dev/iommu are abstracted by a new
> iommufd object (compiled in with the CONFIG_IOMMUFD option).
>
> Any QEMU device (e.g. a vfio device) wishing to use /dev/iommu must be
> linked with an iommufd object. In this series, only the vfio-pci device
> is given this capability (other VFIO devices are not yet ready).
>
> It gets a new optional parameter named iommufd, which allows passing
> an iommufd object:
>
>     -object iommufd,id=iommufd0
>     -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0
>
> Note that /dev/iommu and the vfio cdev can be opened externally by a
> management layer. In that case the fds are passed:
>
>     -object iommufd,id=iommufd0,fd=22
>     -device vfio-pci,iommufd=iommufd0,fd=23
>
> If the fd parameter is not passed, the fd is opened by QEMU.
> See https://www.mail-archive.com/qemu-devel@nongnu.org/msg937155.html
> for a detailed discussion of this requirement.
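
To illustrate the fd handling described above, here is a minimal sketch of
the "use the passed fd, otherwise open the node" logic. The helper name
get_or_open_cdev() and its signature are hypothetical and are not taken
from the series; the real code lives in the iommufd object and the
vfio-pci device.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a QEMU function): return the descriptor a
 * management layer passed in, or open the character device ourselves
 * when none was given (passed_fd < 0).
 * Returns a usable fd, or -1 if open() fails.
 */
int get_or_open_cdev(int passed_fd, const char *path)
{
    if (passed_fd >= 0) {
        /* fd=N was given on the command line: use it as-is. */
        return passed_fd;
    }

    /* No fd given: open the device node directly, e.g. "/dev/iommu". */
    assert(path != NULL);
    return open(path, O_RDWR);
}
```

With fd=22 this would return 22 without touching the filesystem; without
an fd option it would fall back to opening the path it is given.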
>
> If no iommufd option is passed to the vfio-pci device, iommufd is not
> used and the end-user gets the behavior based on the legacy vfio iommu
> interfaces:
>
>     -device vfio-pci,host=0000:02:00.0
>
> While the legacy kernel interface is group-centric, the new iommufd
> interface is device-centric, relying on device fd and iommufd.
>
> To support both interfaces in the QEMU VFIO device we reworked the vfio
> container abstraction so that the generic VFIO code can use either
> backend.
>
> The VFIOContainer object becomes a base object derived into
> a) the legacy VFIO container and
> b) the new iommufd based container.
>
> The base object implements generic code, such as the memory_listener
> and address space management, whereas the derived objects implement
> the callbacks specific to each BE (legacy and iommufd). Each backend
> has its own way to set up the secure context and its own DMA
> management interface. The diagram below shows what this looks like
> with both BEs.
>
>                     VFIO                           AddressSpace/Memory
>     +-------+  +----------+  +-----+  +-----+
>     |  pci  |  | platform |  |  ap |  | ccw |
>     +---+---+  +----+-----+  +--+--+  +--+--+     +----------------------+
>         |           |           |        |        |   AddressSpace       |
>         |           |           |        |        +------------+---------+
>     +---V-----------V-----------V--------V----+               /
>     |           VFIOAddressSpace              | <------------+
>     |                  |                      |  MemoryListener
>     |          VFIOContainer list             |
>     +-------+----------------------------+----+
>             |                            |
>             |                            |
>     +-------V------+            +--------V----------+
>     |   iommufd    |            |    vfio legacy    |
>     |  container   |            |     container     |
>     +-------+------+            +--------+----------+
>             |                            |
>             | /dev/iommu                 | /dev/vfio/vfio
>             | /dev/vfio/devices/vfioX    | /dev/vfio/$group_id
> Userspace   |                            |
> ============+============================+===========================
> Kernel      |  device fd                 |
>             +---------------+            | group/container fd
>             | (BIND_IOMMUFD |            | (SET_CONTAINER/SET_IOMMU)
>             |  ATTACH_IOAS) |            | device fd
>             |               |            |
>             |       +-------V------------V-----------------+
>     iommufd |       |                vfio                  |
> (map/unmap  |       +---------+--------------------+-------+
> ioas_copy)  |                 |                    | map/unmap
>             |                 |                    |
>      +------V------+    +-----V------+      +------V--------+
>      | iommufd core|    |  device    |      |  vfio iommu   |
>      +-------------+    +------------+      +---------------+
>
> [Secure Context setup]
> - iommufd BE: uses device fd and iommufd to set up the secure context
>               (bind_iommufd, attach_ioas)
> - vfio legacy BE: uses group fd and container fd to set up the secure
>                   context (set_container, set_iommu)
> [Device access]
> - iommufd BE: device fd is opened through /dev/vfio/devices/vfioX
> - vfio legacy BE: device fd is retrieved from group fd ioctl
> [DMA Mapping flow]
> 1. VFIOAddressSpace receives MemoryRegion add/del via MemoryListener
> 2. VFIO populates DMA map/unmap via the container BEs
>    *) iommufd BE: uses iommufd
>    *) vfio legacy BE: uses container fd
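
The base container / backend split and the DMA mapping flow above can be
sketched roughly as follows. This is only an illustration of the pattern:
every type and function name here (Container, ContainerOps,
container_dma_map, ...) is made up for the example and does not match the
definitions in the series, and the callbacks print a message where the
real backends would issue ioctls.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef struct Container Container;

/* Backend-specific callbacks, one table per backend (illustrative). */
typedef struct ContainerOps {
    const char *name;
    int (*dma_map)(Container *c, uint64_t iova, uint64_t size);
    int (*dma_unmap)(Container *c, uint64_t iova, uint64_t size);
} ContainerOps;

/* Base object: state and bookkeeping shared by both backends. */
struct Container {
    const ContainerOps *ops;
    unsigned int nr_mappings;
};

/* Derived objects embed the base as their first member. */
typedef struct LegacyContainer {
    Container base;
    int container_fd;      /* would be an fd on /dev/vfio/vfio */
} LegacyContainer;

typedef struct IommufdContainer {
    Container base;
    int iommufd;           /* would be an fd on /dev/iommu */
    uint32_t ioas_id;
} IommufdContainer;

int legacy_dma_map(Container *c, uint64_t iova, uint64_t size)
{
    LegacyContainer *lc = (LegacyContainer *)c;
    printf("legacy: map iova=0x%" PRIx64 " size=0x%" PRIx64
           " on container fd %d\n", iova, size, lc->container_fd);
    return 0;
}

int legacy_dma_unmap(Container *c, uint64_t iova, uint64_t size)
{
    LegacyContainer *lc = (LegacyContainer *)c;
    printf("legacy: unmap iova=0x%" PRIx64 " size=0x%" PRIx64
           " on container fd %d\n", iova, size, lc->container_fd);
    return 0;
}

int iommufd_dma_map(Container *c, uint64_t iova, uint64_t size)
{
    IommufdContainer *ic = (IommufdContainer *)c;
    printf("iommufd: map iova=0x%" PRIx64 " size=0x%" PRIx64
           " in ioas %" PRIu32 "\n", iova, size, ic->ioas_id);
    return 0;
}

int iommufd_dma_unmap(Container *c, uint64_t iova, uint64_t size)
{
    IommufdContainer *ic = (IommufdContainer *)c;
    printf("iommufd: unmap iova=0x%" PRIx64 " size=0x%" PRIx64
           " in ioas %" PRIu32 "\n", iova, size, ic->ioas_id);
    return 0;
}

const ContainerOps legacy_ops = {
    "legacy", legacy_dma_map, legacy_dma_unmap
};
const ContainerOps iommufd_ops = {
    "iommufd", iommufd_dma_map, iommufd_dma_unmap
};

/* Generic entry points: what listener code would call for any backend. */
int container_dma_map(Container *c, uint64_t iova, uint64_t size)
{
    assert(c && c->ops && c->ops->dma_map);
    int ret = c->ops->dma_map(c, iova, size);
    if (ret == 0) {
        c->nr_mappings++;          /* generic bookkeeping in the base */
    }
    return ret;
}

int container_dma_unmap(Container *c, uint64_t iova, uint64_t size)
{
    assert(c && c->ops && c->ops->dma_unmap);
    if (c->nr_mappings == 0) {
        return -1;                 /* nothing mapped: reject in the base */
    }
    int ret = c->ops->dma_unmap(c, iova, size);
    if (ret == 0) {
        c->nr_mappings--;
    }
    return ret;
}
```

The point of the sketch is that the caller only ever sees the base
Container pointer, so the same listener code can drive either backend.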
>
>
> Thanks,
> Yi, Yi, Eric, Zhenzhong
>
>
> Eric Auger (8):
>   scripts/update-linux-headers: Add iommufd.h
>   vfio/common: Introduce vfio_container_add|del_section_window()
>   vfio/container: Introduce vfio_[attach/detach]_device
>   vfio/platform: Use vfio_[attach/detach]_device
>   vfio/ap: Use vfio_[attach/detach]_device
>   vfio/ccw: Use vfio_[attach/detach]_device
>   backends/iommufd: Introduce the iommufd object
>   vfio/pci: Allow the selection of a given iommu backend
>
> Yi Liu (5):
>   vfio/common: Move IOMMU agnostic helpers to a separate file
>   vfio/common: Move legacy VFIO backend code into separate container.c
>   vfio: Add base container
>   util/char_dev: Add open_cdev()
>   vfio/iommufd: Implement the iommufd backend
>
> Zhenzhong Duan (9):
>   Update linux-header to support iommufd cdev and hwpt alloc
>   vfio/common: Extract out vfio_kvm_device_[add/del]_fd
>   vfio/common: Add a vfio device iterator
>   vfio/common: Refactor vfio_viommu_preset() to be group agnostic
>   vfio/common: Simplify vfio_viommu_preset()
>   Add iommufd configure option
>   vfio/iommufd: Add vfio device iterator callback for iommufd
>   vfio/pci: Adapt vfio pci hot reset support with iommufd BE
>   vfio/pci: Make vfio cdev pre-openable by passing a file handle
>
>  MAINTAINERS                           |   13 +
>  backends/Kconfig                      |    4 +
>  backends/iommufd.c                    |  291 ++++
>  backends/meson.build                  |    3 +
>  backends/trace-events                 |   13 +
>  hw/vfio/ap.c                          |   68 +-
>  hw/vfio/ccw.c                         |  120 +-
>  hw/vfio/common.c                      | 1948 +++----------------------
>  hw/vfio/container-base.c              |  160 ++
>  hw/vfio/container.c                   | 1208 +++++++++++++++
>  hw/vfio/helpers.c                     |  626 ++++++++
>  hw/vfio/iommufd.c                     |  554 +++++++
>  hw/vfio/meson.build                   |    6 +
>  hw/vfio/pci.c                         |  319 +++-
>  hw/vfio/platform.c                    |   43 +-
>  hw/vfio/spapr.c                       |   22 +-
>  hw/vfio/trace-events                  |   21 +-
>  include/hw/vfio/vfio-common.h         |  111 +-
>  include/hw/vfio/vfio-container-base.h |  158 ++
>  include/qemu/char_dev.h               |   16 +
>  include/standard-headers/linux/fuse.h |    3 +
>  include/sysemu/iommufd.h              |   49 +
>  linux-headers/linux/iommufd.h         |  444 ++++++
>  linux-headers/linux/kvm.h             |   13 +-
>  linux-headers/linux/vfio.h            |  148 +-
>  meson.build                           |    6 +
>  meson_options.txt                     |    2 +
>  qapi/qom.json                         |   18 +-
>  qemu-options.hx                       |   13 +
>  scripts/meson-buildoptions.sh         |    3 +
>  scripts/update-linux-headers.sh       |    3 +-
>  util/chardev_open.c                   |   61 +
>  util/meson.build                      |    1 +
>  33 files changed, 4395 insertions(+), 2073 deletions(-)
>  create mode 100644 backends/iommufd.c
>  create mode 100644 hw/vfio/container-base.c
>  create mode 100644 hw/vfio/container.c
>  create mode 100644 hw/vfio/helpers.c
>  create mode 100644 hw/vfio/iommufd.c
>  create mode 100644 include/hw/vfio/vfio-container-base.h
>  create mode 100644 include/qemu/char_dev.h
>  create mode 100644 include/sysemu/iommufd.h
>  create mode 100644 linux-headers/linux/iommufd.h
>  create mode 100644 util/chardev_open.c
>

