On 3/12/26 19:45, Matt Evans wrote:
> Hi all,
> 
> 
> There were various suggestions in the September 2025 thread "[TECH
> TOPIC] vfio, iommufd: Enabling user space drivers to vend more
> granular access to client processes" [0], and LPC discussions, around
> improving the situation for multi-process userspace driver designs.
> This RFC series implements some of these ideas.
> 
> (Thanks for feedback on v1!  Revised series, with changes noted
> inline.)
> 
> Background: Multi-process USDs
> ==============================
> 
> The userspace driver scenario discussed in that thread involves a
> primary process driving a PCIe function through VFIO/iommufd, which
> manages the function-wide ownership/lifecycle.  The function is
> designed to provide multiple distinct programming interfaces (for
> example, several independent MMIO register frames in one function),
> and the primary process delegates control of these interfaces to
> multiple independent client processes (which do the actual work).
> This scenario clearly relies on a HW design that provides appropriate
> isolation between the programming interfaces.
> 
> The two key needs are:
> 
>  1.  Mechanisms to safely delegate a subset of the device MMIO
>      resources to a client process without over-sharing wider access
>      (or influence over whole-device activities, such as reset).
> 
>  2.  Mechanisms to allow a client process to do its own iommufd
>      management w.r.t. its address space, in a way that's isolated
>      from DMA relating to other clients.
> 
> 
> mmap() of VFIO DMABUFs
> ======================
> 
> This RFC addresses #1 in "vfio/pci: Support mmap() of a VFIO DMABUF",
> implementing the proposals in [0] to add mmap() support to the
> existing VFIO DMABUF exporter.
> 
> This enables a userspace driver to define DMABUF ranges corresponding
> to sub-ranges of a BAR, and grant a given client (via a shared fd)
> the capability to access (only) those sub-ranges.  The VFIO device fds
> would be kept private to the primary process.  All the client can do
> with that fd is map (or iomap via iommufd) that specific subset of
> resources, and the impact of bugs/malice is contained.
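
For readers following along, the fd hand-off described above could look roughly like this from userspace. This is only a minimal sketch under assumptions: it assumes a Unix-domain socket already exists between the primary and the client, and send_fd(), recv_fd() and map_window() are hypothetical helper names, not part of the series. It also assumes the received fd can be mapped MAP_SHARED from offset 0, which for a real VFIO DMABUF depends on the mmap() support proposed here.

```c
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one fd. */
union fd_cmsg {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Primary side: pass one fd (e.g. an exported DMABUF fd) to the client. */
static int send_fd(int sock, int fd)
{
	char byte = 0;	/* SCM_RIGHTS needs at least one byte of payload */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Client side: receive the fd; returns -1 on failure. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;
	int fd = -1;

	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Client side: map the first len bytes of the fd; MAP_FAILED on error. */
static void *map_window(int fd, size_t len)
{
	return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

The client never sees the VFIO device fd in this arrangement; it only holds the one fd it was handed, which is the containment property the cover letter is after.
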
> 
>  (We'll follow up on #2 separately, as a related-but-distinct problem.
>   PASIDs are one way to achieve per-client isolation of DMA; another
>   could be sharing of a single IOVA space via 'constrained' iommufds.)
> 
> 
> New in v2: To achieve this, the existing VFIO BAR mmap() path is
> converted to use DMABUFs behind the scenes, in "vfio/pci: Convert BAR
> mmap() to use a DMABUF" plus new helper functions, as Jason/Christian
> suggested in the v1 discussion [3].
> 
> This means:
> 
>  - Both regular and new DMABUF BAR mappings share the same vm_ops,
>    i.e.  mmap()ing DMABUFs is a smaller change on top of the existing
>    mmap().
> 
>  - The zapping of mappings occurs via vfio_pci_dma_buf_move(), and the
>    vfio_pci_zap_bars() originally paired with the _move()s can go
>    away.  Each DMABUF has a unique address_space.
> 
>  - It's a step towards future iommufd VFIO Type1 emulation
>    implementing P2P, since iommufd can now get a DMABUF from a VA that
>    it's mapping for IO; the VMAs' vm_file is that of the backing
>    DMABUF.
> 
> 
> Revocation/reclaim
> ==================
> 
> Mapping a BAR subset is useful, but the lifetime of access granted to
> a client needs to be managed well.  For example, a protocol between
> the primary process and the client can indicate when the client is
> done, and when it's safe to reuse the resources elsewhere, but cleanup
> can't practically be cooperative.
> 
> For robustness, we enable the driver to make the resources
> guaranteed-inaccessible when it chooses, so that it can re-assign them
> to other uses in future.
> 
> "vfio/pci: Permanently revoke a DMABUF on request" adds a new VFIO
> device fd ioctl, VFIO_DEVICE_PCI_DMABUF_REVOKE.  This takes a DMABUF
> fd parameter previously exported (from that device!) and permanently
> revokes the DMABUF.  This notifies/detaches importers, zaps PTEs for
> any mappings, and guarantees no future attachment/import/map/access is
> possible by any means.
> 
> A primary driver process would use this operation when the client's
> tenure ends to reclaim "loaned-out" MMIO interfaces, at which point
> the interfaces could be safely re-used.
> 
> New in v2: ioctl() on VFIO driver fd, rather than DMABUF fd.  A DMABUF
> is revoked using code common to vfio_pci_dma_buf_move(), selectively
> zapping mappings (after waiting for completion on the
> dma_buf_invalidate_mappings() request).
> 
> 
> BAR mapping access attributes
> =============================
> 
> Inspired by Alex [Mastro] and Jason's comments in [0] and Mahmoud's
> work in [1] with the goal of controlling CPU access attributes for
> VFIO BAR mappings (e.g. WC), we can decorate DMABUFs with access
> attributes that are then used by a mapping's PTEs.
> 
> I've proposed reserving a field in struct
> vfio_device_feature_dma_buf's flags to specify an attribute for its
> ranges.  Although that keeps the (UAPI) struct unchanged, it means all
> ranges in a DMABUF share the same attribute.  I feel a single
> attribute-to-mmap() relation is logical/reasonable.  An application
> can also create multiple DMABUFs to describe any BAR layout and mix of
> attributes.
> 
> 
> Tests
> =====
> 
> (Still sharing the [RFC ONLY] userspace test/demo program for context,
> not for merge.)
> 
> It illustrates & tests various map/revoke cases, but doesn't use the
> existing VFIO selftests and relies on a (tweaked) QEMU EDU function.
> I'm (still) working on integrating the scenarios into the existing
> VFIO selftests.
> 
> This code has been tested with DMABUFs of single/multiple ranges,
> aliasing mmap()s, aliasing ranges across DMABUFs, vm_pgoff > 0,
> revocation, shutdown/cleanup scenarios, and hugepage mappings; all
> seem to work correctly.  I've also lightly tested WC mappings (by
> observing that the resulting PTEs have the correct attributes...).
> 
> 
> Fin
> ===
> 
> v2 is based on next-20260310 (to build on Leon's recent series
> "vfio: Wait for dma-buf invalidation to complete" [2]).
> 
> 
> Please share your thoughts!  I'd like to de-RFC if we feel this
> approach is now fair.

I only skimmed over it, but at least offhand I couldn't find anything 
fundamentally wrong.

The locking order seems to change in patch #6. In general I strongly recommend 
enabling lockdep while testing anyway, but especially when I see such changes.

In addition, it might also be a good idea to have a lockdep initcall 
function which defines the locking order that all the VFIO code should 
follow.

See the function dma_resv_lockdep() for an example of how to do that. Especially 
with mmap support and all the locks involved in that, it has proven to be 
good practice to have something like that.

Regards,
Christian.

> 
> 
> Many thanks,
> 
> 
> Matt
> 
> 
> 
> References:
> 
> [0]: 
> https://lore.kernel.org/linux-iommu/[email protected]/
> [1]: https://lore.kernel.org/all/[email protected]/
> [2]: 
> https://lore.kernel.org/linux-iommu/20260205-nocturnal-poetic-chamois-f566ad@houat/T/#m310cd07011e3a1461b6fda45e3f9b886ba76571a
> [3]: https://lore.kernel.org/all/[email protected]/
> 
> --------------------------------------------------------------------------------
> Changelog:
> 
> v2:  Respin based on the feedback/suggestions:
> 
> - Transform the existing VFIO BAR mmap path to also use DMABUFs behind
>   the scenes, and then simply share that code for explicitly-mapped
>   DMABUFs.
> 
> - Refactor the export itself out of vfio_pci_core_feature_dma_buf, so
>   that it is shared with a new vfio_pci_core_mmap_prep_dmabuf helper
>   used by the regular VFIO mmap to create a DMABUF.
> 
> - Revoke buffers using a VFIO device fd ioctl
> 
> v1: https://lore.kernel.org/all/[email protected]/
> 
> 
> Matt Evans (10):
>   vfio/pci: Set up VFIO barmap before creating a DMABUF
>   vfio/pci: Clean up DMABUFs before disabling function
>   vfio/pci: Add helper to look up PFNs for DMABUFs
>   vfio/pci: Add a helper to create a DMABUF for a BAR-map VMA
>   vfio/pci: Convert BAR mmap() to use a DMABUF
>   vfio/pci: Remove vfio_pci_zap_bars()
>   vfio/pci: Support mmap() of a VFIO DMABUF
>   vfio/pci: Permanently revoke a DMABUF on request
>   vfio/pci: Add mmap() attributes to DMABUF feature
>   [RFC ONLY] selftests: vfio: Add standalone vfio_dmabuf_mmap_test
> 
>  drivers/vfio/pci/Kconfig                      |   3 +-
>  drivers/vfio/pci/Makefile                     |   3 +-
>  drivers/vfio/pci/vfio_pci_config.c            |  18 +-
>  drivers/vfio/pci/vfio_pci_core.c              | 123 +--
>  drivers/vfio/pci/vfio_pci_dmabuf.c            | 425 +++++++--
>  drivers/vfio/pci/vfio_pci_priv.h              |  46 +-
>  include/uapi/linux/vfio.h                     |  42 +-
>  tools/testing/selftests/vfio/Makefile         |   1 +
>  .../vfio/standalone/vfio_dmabuf_mmap_test.c   | 837 ++++++++++++++++++
>  9 files changed, 1339 insertions(+), 159 deletions(-)
>  create mode 100644 tools/testing/selftests/vfio/standalone/vfio_dmabuf_mmap_test.c
> 
