On Fri, 2020-10-23 at 13:57 +0800, chao hao wrote: > On Wed, 2020-10-21 at 17:55 +0100, Robin Murphy wrote: > > On 2020-10-19 12:30, Chao Hao wrote: > > > MTK_IOMMU driver writes one page entry and does tlb flush at a time > > > currently. More optimal would be to aggregate the writes and flush > > > BUS buffer in the end. > > > > That's exactly what iommu_iotlb_gather_add_page() is meant to achieve. > > Rather than jumping straight into hacking up a new API to go round the > > back of the existing API design, it would be far better to ask the > > question of why that's not behaving as expected. > > Thanks for you review! > > iommu_iotlb_gather_add_page is put in io_pgtable_tlb_add_page(). > io_pgtable_tlb_add_page() only be called in > unmapping and mapping flow doesn't have it in linux iommu driver, but > mtk iommu needs to do tlb sync in mapping > and unmapping to avoid old data being in the iommu tlb. > > In addtion, we hope to do tlb sync once when all the pages mapping done. > iommu_iotlb_gather_add_page maybe do > tlb sync more than once. because one whole buffer consists of different > page size(1MB/64K/4K). > > Based on the previous considerations, don't find more appropriate the > way of tlb sync for mtk iommu, so we add a new API. > > > > > > For 50MB buffer mapping, if mtk_iommu driver use iotlb_sync_range() > > > instead of tlb_add_range() and tlb_flush_walk/leaf(), it can increase > > > 50% performance or more(depending on size of every page size) in > > > comparison to flushing after each page entry update. So we prefer to > > > use iotlb_sync_range() to replace iotlb_sync(), tlb_add_range() and > > > tlb_flush_walk/leaf() for MTK platforms. > > > > In the case of mapping, it sounds like what you actually want to do is > > hook up .iotlb_sync_map and generally make IO_PGTABLE_QUIRK_TLBI_ON_MAP > > cleverer, because the current implementation is as dumb as it could > > possibly be. > > iotlb_sync_map only has one parameter(iommu_domain), but mtk > iommu_domain maybe include the whole iova space, if mtk_iommu to do tlb > sync based on iommu_domain, it is equivalent to do tlb flush all in > fact. > iommu driver will do tlb sync in every mapping page when mtk iommu sets > IO_PGTABLE_QUIRK_TLBI_ON_MAP(io_pgtable_tlb_flush_walk), > as is the commit message mentioned, it will drop mapping performance in > mtk platform. > > > > In fact if we simply passed an address range to > > .iotlb_sync_map, io-pgtable probably wouldn't need to be involved at all > > any more. Sorry, I forget to reply the question in previous mail. Do you mean we need to modify iotlb_sync_map() input parameter(ex: add start/end iova)?
> > I know it is not a good idea probably by adding a new api, but I found > out that tlb sync only to be done after mapping one page, so if > mtk_iommu hope to do tlb sync once after all the pages map done, could > you give me some advices? thanks! > > > > > Robin. > > > > > Signed-off-by: Chao Hao <chao....@mediatek.com> > > > --- > > > drivers/iommu/mtk_iommu.c | 6 ++++++ > > > 1 file changed, 6 insertions(+) > > > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > > > index 785b228d39a6..d3400c15ff7b 100644 > > > --- a/drivers/iommu/mtk_iommu.c > > > +++ b/drivers/iommu/mtk_iommu.c > > > @@ -224,6 +224,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned > > > long iova, size_t size, > > > } > > > } > > > > > > +static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t > > > size) > > > +{ > > > + mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL) > > > +} > > > + > > > static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather > > > *gather, > > > unsigned long iova, size_t > > > granule, > > > void *cookie) > > > @@ -536,6 +541,7 @@ static const struct iommu_ops mtk_iommu_ops = { > > > .map = mtk_iommu_map, > > > .unmap = mtk_iommu_unmap, > > > .flush_iotlb_all = mtk_iommu_flush_iotlb_all, > > > + .iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync, > > > .iotlb_sync = mtk_iommu_iotlb_sync, > > > .iova_to_phys = mtk_iommu_iova_to_phys, > > > .probe_device = mtk_iommu_probe_device, > > > >