Hi Zhenzhong

On 8/30/23 12:37, Zhenzhong Duan wrote:
> Hi All,
>
> As the kernel side iommufd cdev and hot reset feature have been queued,
> also hwpt alloc has been added in Jason's for_next branch [1], I'd like
> to update a new version matching kernel side update and with rfc flag
> removed. Qemu code can be found at [2], look forward more comments!
>
>
> We have done wide test with different combinations, e.g:
>
> - PCI device were tested
> - FD passing and hot reset with some trick.
> - device hotplug test with legacy and iommufd backends
> - with or without vIOMMU for legacy and iommufd backends
> - devices linked to different iommufds
> - VFIO migration with a E800 net card(no dirty sync support) passthrough
> - platform, ccw and ap were only compile-tested due to environment limit
>
>
> Given some iommufd kernel limitations, the iommufd backend is
> not yet fully on par with the legacy backend w.r.t. features like:
> - p2p mappings (you will see related error traces)
> - dirty page sync
> - and etc.
>
>
> Changelog:
> v1:
> - Alloc hwpt instead of using auto hwpt
> - elaborate iommufd code per Nicolin
> - consolidate two patches and drop as.c
> - typo error fix and function rename
>
> I didn't list change log of rfc stage, see [3] if anyone is interested.
>
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git
> [2] https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_cdev_v1
> [3] https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg02529.html
Do you have a branch to share? It does not apply to upstream

Thanks

Eric
>
>
> --------------------------------------------------------------------------
>
> With the introduction of iommufd, the Linux kernel provides a generic
> interface for userspace drivers to propagate their DMA mappings to kernel
> for assigned devices. This series does the porting of the VFIO devices
> onto the /dev/iommu uapi and let it coexist with the legacy implementation.
>
> This QEMU integration is the result of a collaborative work between
> Yi Liu, Yi Sun, Nicolin Chen and Eric Auger.
>
> At QEMU level, interactions with the /dev/iommu are abstracted by a new
> iommufd object (compiled in with the CONFIG_IOMMUFD option).
>
> Any QEMU device (e.g. vfio device) wishing to use /dev/iommu must be
> linked with an iommufd object. In this series, the vfio-pci device is
> granted with such capability (other VFIO devices are not yet ready):
>
> It gets a new optional parameter named iommufd which allows to pass
> an iommufd object:
>
>     -object iommufd,id=iommufd0
>     -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0
>
> Note the /dev/iommu and vfio cdev can be externally opened by a
> management layer. In such a case the fd is passed:
>
>     -object iommufd,id=iommufd0,fd=22
>     -device vfio-pci,iommufd=iommufd0,fd=23
>
> If the fd parameter is not passed, the fd is opened by QEMU.
> See https://www.mail-archive.com/qemu-devel@nongnu.org/msg937155.html
> for detailed discuss on this requirement.
>
> If no iommufd option is passed to the vfio-pci device, iommufd is not
> used and the end-user gets the behavior based on the legacy vfio iommu
> interfaces:
>
>     -device vfio-pci,host=0000:02:00.0
>
> While the legacy kernel interface is group-centric, the new iommufd
> interface is device-centric, relying on device fd and iommufd.
>
> To support both interfaces in the QEMU VFIO device we reworked the vfio
> container abstraction so that the generic VFIO code can use either
> backend.
>
> The VFIOContainer object becomes a base object derived into
> a) the legacy VFIO container and
> b) the new iommufd based container.
>
> The base object implements generic code such as code related to
> memory_listener and address space management whereas the derived
> objects implement callbacks specific to either BE, legacy and
> iommufd. Indeed each backend has its own way to setup secure context
> and dma management interface. The below diagram shows how it looks
> like with both BEs.
>
>                     VFIO                           AddressSpace/Memory
>     +-------+  +----------+  +-----+  +-----+
>     |  pci  |  | platform |  |  ap |  | ccw |
>     +---+---+  +----+-----+  +--+--+  +--+--+     +----------------------+
>         |           |           |        |        |   AddressSpace       |
>         |           |           |        |        +------------+---------+
>     +---V-----------V-----------V--------V----+               /
>     |           VFIOAddressSpace              | <------------+
>     |                  |                      |  MemoryListener
>     |          VFIOContainer list             |
>     +-------+----------------------------+----+
>             |                            |
>             |                            |
>     +-------V------+            +--------V----------+
>     |   iommufd    |            |    vfio legacy    |
>     |  container   |            |     container     |
>     +-------+------+            +--------+----------+
>             |                            |
>             | /dev/iommu                 | /dev/vfio/vfio
>             | /dev/vfio/devices/vfioX    | /dev/vfio/$group_id
> Userspace   |                            |
> ============+============================+===========================
> Kernel      |  device fd                 |
>             +---------------+            | group/container fd
>             | (BIND_IOMMUFD |            | (SET_CONTAINER/SET_IOMMU)
>             |  ATTACH_IOAS) |            | device fd
>             |               |            |
>             |       +-------V------------V-----------------+
>     iommufd |       |                vfio                  |
> (map/unmap  |       +---------+--------------------+-------+
> ioas_copy)  |                 |                    | map/unmap
>             |                 |                    |
>      +------V------+    +-----V------+      +------V--------+
>      | iommfd core |    |  device    |      |  vfio iommu   |
>      +-------------+    +------------+      +---------------+
>
> [Secure Context setup]
> - iommufd BE: uses device fd and iommufd to setup secure context
>               (bind_iommufd, attach_ioas)
> - vfio legacy BE: uses group fd and container fd to setup secure context
>                   (set_container, set_iommu)
> [Device access]
> - iommufd BE: device fd is opened through
>               /dev/vfio/devices/vfioX
> - vfio legacy BE: device fd is retrieved from group fd ioctl
> [DMA Mapping flow]
> 1. VFIOAddressSpace receives MemoryRegion add/del via MemoryListener
> 2. VFIO populates DMA map/unmap via the container BEs
>    *) iommufd BE: uses iommufd
>    *) vfio legacy BE: uses container fd
>
>
> Thanks,
> Yi, Yi, Eric, Zhenzhong
>
>
> Eric Auger (8):
>   scripts/update-linux-headers: Add iommufd.h
>   vfio/common: Introduce vfio_container_add|del_section_window()
>   vfio/container: Introduce vfio_[attach/detach]_device
>   vfio/platform: Use vfio_[attach/detach]_device
>   vfio/ap: Use vfio_[attach/detach]_device
>   vfio/ccw: Use vfio_[attach/detach]_device
>   backends/iommufd: Introduce the iommufd object
>   vfio/pci: Allow the selection of a given iommu backend
>
> Yi Liu (5):
>   vfio/common: Move IOMMU agnostic helpers to a separate file
>   vfio/common: Move legacy VFIO backend code into separate container.c
>   vfio: Add base container
>   util/char_dev: Add open_cdev()
>   vfio/iommufd: Implement the iommufd backend
>
> Zhenzhong Duan (9):
>   Update linux-header to support iommufd cdev and hwpt alloc
>   vfio/common: Extract out vfio_kvm_device_[add/del]_fd
>   vfio/common: Add a vfio device iterator
>   vfio/common: Refactor vfio_viommu_preset() to be group agnostic
>   vfio/common: Simplify vfio_viommu_preset()
>   Add iommufd configure option
>   vfio/iommufd: Add vfio device iterator callback for iommufd
>   vfio/pci: Adapt vfio pci hot reset support with iommufd BE
>   vfio/pci: Make vfio cdev pre-openable by passing a file handle
>
>  MAINTAINERS                           |   13 +
>  backends/Kconfig                      |    4 +
>  backends/iommufd.c                    |  291 ++++
>  backends/meson.build                  |    3 +
>  backends/trace-events                 |   13 +
>  hw/vfio/ap.c                          |   68 +-
>  hw/vfio/ccw.c                         |  120 +-
>  hw/vfio/common.c                      | 1948 +++---------------------
>  hw/vfio/container-base.c              |  160 ++
>  hw/vfio/container.c                   | 1208 +++++++++++++++
>  hw/vfio/helpers.c                     |  626 ++++++++
>  hw/vfio/iommufd.c                     |  554 +++++++
>  hw/vfio/meson.build                   |    6 +
>  hw/vfio/pci.c                         |  319 +++-
>  hw/vfio/platform.c                    |   43 +-
>  hw/vfio/spapr.c                       |   22 +-
>  hw/vfio/trace-events                  |   21 +-
>  include/hw/vfio/vfio-common.h         |  111 +-
>  include/hw/vfio/vfio-container-base.h |  158 ++
>  include/qemu/char_dev.h               |   16 +
>  include/standard-headers/linux/fuse.h |    3 +
>  include/sysemu/iommufd.h              |   49 +
>  linux-headers/linux/iommufd.h         |  444 ++++++
>  linux-headers/linux/kvm.h             |   13 +-
>  linux-headers/linux/vfio.h            |  148 +-
>  meson.build                           |    6 +
>  meson_options.txt                     |    2 +
>  qapi/qom.json                         |   18 +-
>  qemu-options.hx                       |   13 +
>  scripts/meson-buildoptions.sh         |    3 +
>  scripts/update-linux-headers.sh       |    3 +-
>  util/chardev_open.c                   |   61 +
>  util/meson.build                      |    1 +
>  33 files changed, 4395 insertions(+), 2073 deletions(-)
>  create mode 100644 backends/iommufd.c
>  create mode 100644 hw/vfio/container-base.c
>  create mode 100644 hw/vfio/container.c
>  create mode 100644 hw/vfio/helpers.c
>  create mode 100644 hw/vfio/iommufd.c
>  create mode 100644 include/hw/vfio/vfio-container-base.h
>  create mode 100644 include/qemu/char_dev.h
>  create mode 100644 include/sysemu/iommufd.h
>  create mode 100644 linux-headers/linux/iommufd.h
>  create mode 100644 util/chardev_open.c
>
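
For readers skimming the thread, the base/derived container split described in the cover letter can be pictured with a small C sketch. It is only an illustration under assumptions: every name below (SketchContainer, SketchContainerOps, sketch_container_dma_map and so on) is hypothetical and not taken from the series, and the backend callbacks are stubs that count mappings instead of issuing the real ioctls on the container fd or the iommufd.

```c
/* Illustrative sketch only: all names are hypothetical, not the series' API. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct SketchContainer SketchContainer;

/* Callbacks that each backend (legacy or iommufd) provides. */
typedef struct SketchContainerOps {
    const char *name;
    int (*dma_map)(SketchContainer *c, uint64_t iova, uint64_t size,
                   void *vaddr);
    int (*dma_unmap)(SketchContainer *c, uint64_t iova, uint64_t size);
} SketchContainerOps;

/* Base object: generic code only sees this struct and its ops table. */
struct SketchContainer {
    const SketchContainerOps *ops;
    unsigned int nr_maps;   /* stub bookkeeping, stands in for kernel state */
};

/* Legacy stub: real code would go through the vfio container fd. */
static int legacy_dma_map(SketchContainer *c, uint64_t iova, uint64_t size,
                          void *vaddr)
{
    (void)iova; (void)size; (void)vaddr;
    c->nr_maps++;
    return 0;
}

static int legacy_dma_unmap(SketchContainer *c, uint64_t iova, uint64_t size)
{
    (void)iova; (void)size;
    if (c->nr_maps == 0) {
        return -1;          /* nothing mapped */
    }
    c->nr_maps--;
    return 0;
}

/* iommufd stub: real code would go through the iommufd (IOAS map/unmap). */
static int iommufd_dma_map(SketchContainer *c, uint64_t iova, uint64_t size,
                           void *vaddr)
{
    (void)iova; (void)size; (void)vaddr;
    c->nr_maps++;
    return 0;
}

static int iommufd_dma_unmap(SketchContainer *c, uint64_t iova, uint64_t size)
{
    (void)iova; (void)size;
    if (c->nr_maps == 0) {
        return -1;          /* nothing mapped */
    }
    c->nr_maps--;
    return 0;
}

static const SketchContainerOps sketch_legacy_ops = {
    "legacy", legacy_dma_map, legacy_dma_unmap,
};

static const SketchContainerOps sketch_iommufd_ops = {
    "iommufd", iommufd_dma_map, iommufd_dma_unmap,
};

/* Generic entry points: the caller never needs to know which BE is in use. */
int sketch_container_dma_map(SketchContainer *c, uint64_t iova,
                             uint64_t size, void *vaddr)
{
    assert(c && c->ops && c->ops->dma_map);
    return c->ops->dma_map(c, iova, size, vaddr);
}

int sketch_container_dma_unmap(SketchContainer *c, uint64_t iova,
                               uint64_t size)
{
    assert(c && c->ops && c->ops->dma_unmap);
    return c->ops->dma_unmap(c, iova, size);
}
```

A caller would initialise a SketchContainer with either ops table and then use only the sketch_container_* entry points, which mirrors the idea of a MemoryListener driving map/unmap without caring about the backend. How the series itself lays out the base container is in the patches, not in this sketch.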