Hi All,

Previously I used virtio-iommu as a non-s390x test vehicle[0] for the
single queue flushing scheme introduced by my s390x DMA API conversion
series[1]. For this I modified virtio-iommu to a) use .iotlb_sync_map and
b) enable IOMMU_CAP_DEFERRED_FLUSH. It turned out that deferred flush, and
even just the introduction of ops->iotlb_sync_map, yields a performance
uplift[2] even with per-CPU queues. So here is a small series of these two
changes.
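
To illustrate why moving the sync into an iotlb_sync_map-style hook can
help, here is a minimal userspace toy model. It is not the driver code:
every name in it (toy_viommu, toy_sync(), toy_map_range_per_page(),
toy_map_range_batched()) is hypothetical, and it assumes that without the
hook each map request is synced on its own, while with the hook requests
are only queued and synced once per mapped range.

```c
#include <stddef.h>

/* Hypothetical toy model, not the virtio-iommu driver's data structures. */
struct toy_viommu {
	size_t pending;   /* requests queued but not yet synced */
	size_t nr_syncs;  /* round trips to the (imaginary) device */
	size_t nr_mapped; /* requests the device has processed */
};

/* Flush all queued requests in a single round trip. */
static void toy_sync(struct toy_viommu *v)
{
	if (!v->pending)
		return;
	v->nr_mapped += v->pending;
	v->pending = 0;
	v->nr_syncs++;
}

/* Assumed old scheme: queue one request and sync it immediately. */
static void toy_map_sync(struct toy_viommu *v)
{
	v->pending++;
	toy_sync(v);
}

/* Assumed new scheme: map only queues, the caller syncs afterwards. */
static void toy_map_deferred(struct toy_viommu *v)
{
	v->pending++;
}

/* Map n pages with one sync per page; returns the number of syncs. */
size_t toy_map_range_per_page(struct toy_viommu *v, size_t n)
{
	size_t before = v->nr_syncs;

	for (size_t i = 0; i < n; i++)
		toy_map_sync(v);
	return v->nr_syncs - before;
}

/* Map n pages, then sync once, as an iotlb_sync_map-style hook would. */
size_t toy_map_range_batched(struct toy_viommu *v, size_t n)
{
	size_t before = v->nr_syncs;

	for (size_t i = 0; i < n; i++)
		toy_map_deferred(v);
	toy_sync(v);
	return v->nr_syncs - before;
}
```

In this model, mapping 8 pages costs 8 round trips with per-page syncing
and 1 round trip with the batched variant. Real-world gains depend on how
many mappings the caller issues per sync.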

The code is also available on the b4/viommu-deferred-flush branch of my
kernel.org git repository[3].

Note on testing: I tested this series on my AMD Ryzen 3900X workstation
using QEMU 8.1.2 with a pass-through NVMe and Intel 82599 NIC VFs. For the
NVMe I saw an increase of about 10% in IOPS and 30% in read bandwidth
compared with v6.7-rc2. One odd thing though is that QEMU seemed to make
the entire guest resident/pinned once I passed through a PCI device. I seem
to remember this wasn't the case with my last version, but I'm not sure
which QEMU version I used back then.

@Jean-Philippe: I didn't include your R-b's as I changed back to the
nr_endpoints check and this is like 30% of the patches.

Thanks,
Niklas

[0] https://lore.kernel.org/lkml/20230726111433.1105665-1-schne...@linux.ibm.com/
[1] https://lore.kernel.org/lkml/20230825-dma_iommu-v12-0-413445599...@linux.ibm.com/
[2] https://lore.kernel.org/lkml/20230802123612.GA6142@myrica/

Signed-off-by: Niklas Schnelle <schne...@linux.ibm.com>
---
Changes in v3:
- Removed NULL check from viommu_sync_req() (Jason)
- Went back to checking for 0 endpoints in IOTLB ops (Robin)
- Rebased on v6.7-rc2 which includes necessary iommu-dma changes
- Link to v2: https://lore.kernel.org/r/20230918-viommu-sync-map-v2-0-f33767f6c...@linux.ibm.com

Changes in v2:
- Check for viommu == NULL in viommu_sync_req() instead of for
  0 endpoints in ops (Jean-Philippe)
- Added comment where viommu can be NULL (me)
- Link to v1: https://lore.kernel.org/r/20230825-viommu-sync-map-v1-0-56bdcfaa2...@linux.ibm.com

To: Jean-Philippe Brucker <jean-phili...@linaro.org>
To: Joerg Roedel <j...@8bytes.org>
To: Will Deacon <w...@kernel.org>
To: Jason Gunthorpe <j...@ziepe.ca>
To: Robin Murphy <robin.mur...@arm.com>
Cc: virtualization@lists.linux-foundation.org
Cc: io...@lists.linux.dev
Cc: linux-ker...@vger.kernel.org
Cc: Niklas Schnelle <schne...@linux.ibm.com>

---
Niklas Schnelle (2):
      iommu/virtio: Make use of ops->iotlb_sync_map
      iommu/virtio: Add ops->flush_iotlb_all and enable deferred flush

 drivers/iommu/virtio-iommu.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)
---
base-commit: 98b1cc82c4affc16f5598d4fa14b1858671b2263
change-id: 20230825-viommu-sync-map-1bf0cc4fdc15

Best regards,
-- 
Niklas Schnelle