On 2025-12-27 8:59 pm, Barry Song wrote:
On Sun, Dec 28, 2025 at 9:16 AM Leon Romanovsky <[email protected]> wrote:
On Sat, Dec 27, 2025 at 11:52:48AM +1300, Barry Song wrote:
From: Barry Song <[email protected]>
Apply batched DMA synchronization to iommu_dma_sync_sg_for_cpu() and
iommu_dma_sync_sg_for_device(). For all buffers in an SG list, only
a single flush operation is needed.
I do not have the hardware to test this, so the patch is marked as
RFC. I would greatly appreciate any testing feedback.
Cc: Leon Romanovsky <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Ada Couprie Diaz <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Tangquan Zheng <[email protected]>
Signed-off-by: Barry Song <[email protected]>
---
drivers/iommu/dma-iommu.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index ffa940bdbbaf..b68dbfcb7846 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1131,10 +1131,9 @@ void iommu_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sgl,
 			iommu_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
 						      sg->length, dir);
 	} else if (!dev_is_dma_coherent(dev)) {
-		for_each_sg(sgl, sg, nelems, i) {
+		for_each_sg(sgl, sg, nelems, i)
 			arch_sync_dma_for_cpu(sg_phys(sg), sg->length, dir);
-			arch_sync_dma_flush();
-		}
+		arch_sync_dma_flush();
This and previous patches should be squashed into the one which
introduced arch_sync_dma_flush().
Hi Leon,
The series is structured to first introduce no functional change by
replacing all arch_sync_dma_for_* calls with arch_sync_dma_for_* plus
arch_sync_dma_flush(). Subsequent patches then add batching for
different scenarios as separate changes.
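
To illustrate the two steps, here is a rough userspace sketch, not kernel code. The mock_* helpers, struct mock_sg and the counters are hypothetical stand-ins that only count calls; they assume that the per-entry maintenance can be issued without waiting and that one trailing flush covers all of it, which is the premise of this series. The first function has the shape left by the "no functional change" step, the second has the shape after batching.

```c
#include <stddef.h>

/* Hypothetical stand-in for a scatterlist entry: address and length only. */
struct mock_sg {
	unsigned long phys;
	size_t len;
};

/* Counters so a caller can observe how much work each variant issues. */
unsigned int maint_calls;
unsigned int flush_calls;

void reset_counters(void)
{
	maint_calls = 0;
	flush_calls = 0;
}

/* Mock of the per-range cache maintenance; only counts invocations. */
void mock_arch_sync_dma_for_cpu(unsigned long phys, size_t len)
{
	(void)phys;
	(void)len;
	maint_calls++;
}

/* Mock of the completion step (arch_sync_dma_flush() in the series). */
void mock_arch_sync_dma_flush(void)
{
	flush_calls++;
}

/* Step 1 shape: every entry is followed by its own flush. */
void sync_sg_flush_each(const struct mock_sg *sgl, size_t nelems)
{
	for (size_t i = 0; i < nelems; i++) {
		mock_arch_sync_dma_for_cpu(sgl[i].phys, sgl[i].len);
		mock_arch_sync_dma_flush();
	}
}

/* Step 2 shape: issue maintenance for all entries, then flush once. */
void sync_sg_flush_once(const struct mock_sg *sgl, size_t nelems)
{
	for (size_t i = 0; i < nelems; i++)
		mock_arch_sync_dma_for_cpu(sgl[i].phys, sgl[i].len);
	mock_arch_sync_dma_flush();
}
```

For a list of N entries both variants issue N maintenance calls, but the first issues N flushes and the second issues one.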
Another issue is that I was unable to find a board that both runs
mainline and exercises the IOMMU paths affected by these changes.
As a result, patches 7 and 8 are marked as RFC, while the other
patches have been tested on a real board running mainline + changes.
FWIW if you can get your hands on an M.2 NVMe for the Rock5 then that
has an SMMU in front of PCIe (and could also work to test non-coherent
SWIOTLB, with the SMMU in bypass and either some fake restrictive
dma-ranges in the DT or a hack to reduce the DMA mask in the NVMe driver.)
Cheers,
Robin.