The current IOVA allocator allocates within the [0x10000, 1ULL << 39] window without paying attention to the host IOVA reserved regions. This prevents NVMe passthrough from working on ARM, as the fixed IOVAs rapidly grow up to the MSI reserved region [0x8000000, 0x8100000], causing some VFIO_IOMMU_MAP_DMA failures. This series collects the usable IOVA regions using VFIO_IOMMU_GET_INFO (this requires the host to support VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE) and reworks the fixed and temporary IOVA allocators to avoid the reserved regions.
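
To illustrate the idea, here is a minimal standalone sketch of searching a list of usable IOVA ranges, bottom-up as a fixed allocator would and top-down as a temporary allocator would. It is not the code from the patches: struct iova_range, find_iova_above() and find_iova_below() are hypothetical names, and the sketch assumes the ranges are sorted in ascending order, do not overlap, and use inclusive bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: one usable IOVA range, both bounds inclusive. */
struct iova_range {
    uint64_t start;
    uint64_t end;
};

/*
 * Bottom-up search: find the lowest IOVA >= min such that
 * [iova, iova + size - 1] lies entirely inside one usable range.
 * Assumes ranges[] is sorted ascending and non-overlapping.
 */
static bool find_iova_above(const struct iova_range *ranges, size_t nr,
                            uint64_t min, uint64_t size, uint64_t *iova)
{
    assert(size > 0);
    for (size_t i = 0; i < nr; i++) {
        if (ranges[i].end < min) {
            continue; /* range lies entirely below the lower bound */
        }
        uint64_t start = ranges[i].start > min ? ranges[i].start : min;
        /* Compare against size - 1 so that nothing wraps past 2^64. */
        if (ranges[i].end - start >= size - 1) {
            *iova = start;
            return true;
        }
    }
    return false;
}

/*
 * Top-down search: find the highest IOVA such that
 * [iova, iova + size - 1] lies inside one usable range and ends at or
 * below max (inclusive).
 */
static bool find_iova_below(const struct iova_range *ranges, size_t nr,
                            uint64_t max, uint64_t size, uint64_t *iova)
{
    assert(size > 0);
    for (size_t i = nr; i-- > 0; ) {
        if (ranges[i].start > max) {
            continue; /* range lies entirely above the upper bound */
        }
        uint64_t end = ranges[i].end < max ? ranges[i].end : max;
        if (end - ranges[i].start >= size - 1) {
            *iova = end - (size - 1);
            return true;
        }
    }
    return false;
}
```

With two usable ranges on either side of a reserved hole, a bottom-up request that does not fit below the hole is placed at the start of the next range instead of overlapping the hole. A caller advancing its own low-water mark by iova + size would still need to guard against u64 wrap when a range ends at the top of the address space.
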
For the time being we do not change the arbitrary min/max IOVAs. In theory they could be determined dynamically, but the kernel currently fails to expose some HW limitations described in the ACPI tables (such as the PCI root complex Device Memory Address Size Limit). See the kernel thread "[RFC 0/3] iommu: Reserved regions for IOVAs beyond dma_mask and iommu aperture" for more details:
https://lkml.org/lkml/2020/9/28/1102

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/nvme_resv_v2

This was tested on ARM only.

History:
v1 -> v2:
- remove "util/vfio-helpers: Dynamically compute the min/max IOVA"
  to relax the kernel dependency
- Fix capability enumeration loop
- set s->usable_iova_ranges=NULL to avoid double free
- handle possible u64 wrap

Eric Auger (2):
  util/vfio-helpers: Collect IOVA reserved regions
  util/vfio-helpers: Rework the IOVA allocator to avoid IOVA reserved
    regions

 util/vfio-helpers.c | 129 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 123 insertions(+), 6 deletions(-)

--
2.21.3