This patchset improves the tlb flushing performance of iommu_map/unmap for the MediaTek IOMMU.

For iommu_map, the MediaTek IOMMU currently uses IO_PGTABLE_QUIRK_TLBI_ON_MAP
to do a tlb flush for each memory chunk. This is unnecessary. We can improve
it by flushing the tlb only once at the end of iommu_map.

For iommu_unmap, we have already improved the performance with the gather.
But the current gather has to take care of its granule size: if the granule
size is different, it does a tlb flush and starts gathering again. Our HW
doesn't care about the granule size, thus I add a flag (granule_ignore) for
this case.

After this patchset, we achieve only one tlb flush in iommu_map and in
iommu_unmap.

Regardless of sg, for a single segment, I did a simple test:

  size = 20 * SZ_1M;
  /* the worst case, all are 4k mapping. */
  ret = iommu_map(domain, 0x5bb02000, 0x123f1000, size, IOMMU_READ);
  iommu_unmap(domain, 0x5bb02000, size);

This is the time comparison (unit is us):

              original-time  after-improve
   map-20M        59943           2347
   unmap-20M        264             36

This patchset also flushes the tlb only once in the iommu_map_sg case.

Patches [1/6][2/6][3/6] are for map while the others are for unmap.

change note:
v2: Refactor all the code.
    base on v5.10-rc1.

v1: https://lore.kernel.org/linux-iommu/20201019113100.23661-1-chao....@mediatek.com/

Yong Wu (6):
  iommu: Move iotlb_sync_map out from __iommu_map
  iommu: Add iova and size as parameters in iommu_iotlb_map
  iommu/mediatek: Add iotlb_sync_map to sync whole the iova range
  iommu: Add granule_ignore when tlb gather
  iommu/mediatek: Enable granule_ignore for unmap
  iommu/mediatek: Convert tlb_flush_walk to gather_add_page

 drivers/iommu/iommu.c      | 24 +++++++++++++++++++-----
 drivers/iommu/mtk_iommu.c  | 32 ++++++++++++++++++++++++++------
 drivers/iommu/tegra-gart.c |  3 ++-
 include/linux/iommu.h      |  7 +++++--
 4 files changed, 52 insertions(+), 14 deletions(-)

-- 
2.18.0
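
To illustrate the unmap side, here is a minimal user-space sketch of how a gather that ignores the granule size can reduce the number of flushes. It is only a model written for this explanation, not the code in the patches: struct gather_model, model_sync, model_add_page and model_unmap_flushes are hypothetical names, the flush is simulated with a counter, and the mixed page sizes are chosen arbitrarily.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names come from the kernel. */
struct gather_model {
	unsigned long start;	/* first byte of the pending range */
	unsigned long end;	/* one past the last byte of the pending range */
	size_t pgsize;		/* granule of the pending range, 0 when empty */
	bool granule_ignore;	/* models the flag described above */
	unsigned int flushes;	/* number of simulated tlb flushes */
};

/* Simulate one tlb flush for the pending range, then reset the gather. */
void model_sync(struct gather_model *g)
{
	if (g->end > g->start)
		g->flushes++;
	g->start = 0;
	g->end = 0;
	g->pgsize = 0;
}

/* Add one unmapped page, flushing first if the range cannot be merged. */
void model_add_page(struct gather_model *g, unsigned long iova, size_t size)
{
	bool empty = (g->pgsize == 0);
	bool disjoint = !empty && (iova > g->end || iova + size < g->start);
	bool granule_changed = !empty && g->pgsize != size && !g->granule_ignore;

	if (disjoint || granule_changed) {
		model_sync(g);
		empty = true;
	}

	if (empty) {
		g->start = iova;
		g->end = iova + size;
	} else {
		if (iova < g->start)
			g->start = iova;
		if (iova + size > g->end)
			g->end = iova + size;
	}
	g->pgsize = size;
}

/* Unmap a contiguous range made of mixed page sizes and count the flushes. */
unsigned int model_unmap_flushes(bool granule_ignore)
{
	/* Arbitrary mix for illustration: 4K, 4K, 2M, 4K. */
	static const size_t sizes[] = { 0x1000, 0x1000, 0x200000, 0x1000 };
	struct gather_model g = { .granule_ignore = granule_ignore };
	unsigned long iova = 0x5bb02000UL;
	size_t i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		model_add_page(&g, iova, sizes[i]);
		iova += sizes[i];
	}
	model_sync(&g);		/* the final flush at the end of the unmap */

	return g.flushes;
}
```

In this model, model_unmap_flushes(false) flushes each time the page size changes plus once at the end, while model_unmap_flushes(true) flushes only once at the end, which is the behaviour this series aims for on HW that does not care about the granule size.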