Hi Baolu, On 2021/4/14 15:14, Lu Baolu wrote: > On 4/13/21 4:54 PM, Keqian Zhu wrote: >> Block(largepage) mapping is not a proper granule for dirty log tracking. >> Take an extreme example, if DMA writes one byte, under 1G mapping, the >> dirty amount reported is 1G, but under 4K mapping, the dirty amount is >> just 4K. >> >> This adds a new interface named iommu_split_block in IOMMU base layer. >> A specific IOMMU driver can invoke it during start dirty log. If so, the >> driver also need to realize the split_block iommu ops. >> >> We flush all iotlbs after the whole procedure is completed to ease the >> pressure of IOMMU, as we will hanle a huge range of mapping in general. >> >> Signed-off-by: Keqian Zhu <zhukeqi...@huawei.com> >> Signed-off-by: Kunkun Jiang <jiangkun...@huawei.com> >> --- >> drivers/iommu/iommu.c | 41 +++++++++++++++++++++++++++++++++++++++++ >> include/linux/iommu.h | 11 +++++++++++ >> 2 files changed, 52 insertions(+) >> >> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c >> index 667b2d6d2fc0..bb413a927870 100644 >> --- a/drivers/iommu/iommu.c >> +++ b/drivers/iommu/iommu.c >> @@ -2721,6 +2721,47 @@ int iommu_domain_set_attr(struct iommu_domain *domain, >> } >> EXPORT_SYMBOL_GPL(iommu_domain_set_attr); >> +int iommu_split_block(struct iommu_domain *domain, unsigned long iova, >> + size_t size) >> +{ >> + const struct iommu_ops *ops = domain->ops; >> + unsigned int min_pagesz; >> + size_t pgsize; >> + bool flush = false; >> + int ret = 0; >> + >> + if (unlikely(!ops || !ops->split_block)) >> + return -ENODEV; >> + >> + min_pagesz = 1 << __ffs(domain->pgsize_bitmap); >> + if (!IS_ALIGNED(iova | size, min_pagesz)) { >> + pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n", >> + iova, size, min_pagesz); >> + return -EINVAL; >> + } >> + >> + while (size) { >> + flush = true; >> + >> + pgsize = iommu_pgsize(domain, iova, size); >> + >> + ret = ops->split_block(domain, iova, pgsize); >> + if (ret) >> + break; >> + >> + pr_debug("split handled: iova 0x%lx size 0x%zx\n", iova, pgsize); >> + >> + iova += pgsize; >> + size -= pgsize; >> + } >> + >> + if (flush) >> + iommu_flush_iotlb_all(domain); >> + >> + return ret; >> +} >> +EXPORT_SYMBOL_GPL(iommu_split_block); > > Do you really have any consumers of this interface other than the dirty > bit tracking? If not, I don't suggest to make this as a generic IOMMU > interface. > > There is an implicit requirement for such interfaces. The > iommu_map/unmap(iova, size) shouldn't be called at the same time. > Currently there's no such sanity check in the iommu core. A poorly > written driver could mess up the kernel by misusing this interface.
Yes, I don't think up a scenario except dirty tracking. Indeed, we'd better not make them as a generic interface. Do you have any suggestion that underlying iommu drivers can share these code but not make it as a generic iommu interface? I have a not so good idea. Make the "split" interfaces as a static function, and transfer the function pointer to start_dirty_log. But it looks weird and inflexible. On the other hand, if a driver calls map/unmap with split/merge at the same time, it's a bug of driver, it should follow the rule. > > This also applies to iommu_merge_page(). > Thanks, Keqian