On 2021-03-31 04:00, Isaac J. Manjarres wrote:
> When unmapping a buffer from an IOMMU domain, the IOMMU framework unmaps
> the buffer at a granule of the largest page size that is supported by
> the IOMMU hardware and that fits within the buffer. For every block that
> is unmapped, the IOMMU framework will call into the IOMMU driver, and
> then the io-pgtable framework, to walk the page tables to find the entry
> that corresponds to the IOVA, and then unmap the entry.
>
> This can be suboptimal in scenarios where a buffer or a piece of a
> buffer can be split into several contiguous page blocks of the same size.
> For example, consider an IOMMU that supports 4 KB page blocks, 2 MB page
> blocks, and 1 GB page blocks, and a 4 MB buffer being unmapped at IOVA 0.
> The current call-flow will result in 4 indirect calls, and 2 page table
> walks, to unmap 2 entries that are next to each other in the page tables,
> when both entries could have been unmapped in one shot by clearing both
> page table entries in the same call.
s/unmap/map/ and s/clear/set/ and those two paragraphs are still just as
valid. I'd say if it's worth doing anything at all, then it's worth doing
more than just half the job ;)
> These patches add an unmap_pages callback to the io-pgtable code and
> IOMMU drivers, which unmaps an IOVA range consisting of a number of
> pages of the same page size supported by the IOMMU hardware, allowing
> multiple entries to be cleared within the same set of indirect calls.
> The reason for introducing a new unmap_pages callback, rather than
> changing the existing unmap callback, is to give other IOMMU
> drivers/io-pgtable formats time to switch over, so that the transition
> to this approach can be done piecemeal.
>
> The same optimization is applicable to mapping buffers; however, the
> error handling in the io-pgtable layer couldn't be done cleanly, as we
> would need to invoke iommu_unmap to unmap the parts of the buffer that
> were mapped, and then do any TLB maintenance, which seemed like a
> layering violation.
Why couldn't it just return the partial mapping and let the caller roll
it back?
Note that having a weird asymmetric interface was how things started out
way back when - see bd13969b9524 ("iommu: Split iommu_unmaps") for context.
> Any feedback is very much appreciated.
Do you have any real-world performance figures? I proposed this as an
approach because it was clear it could give *some* benefit for
relatively low impact, but I'm curious to find out exactly how much, and
in particular whether it appears to leave anything on the table vs.
punting the entire operation down into the drivers.
Robin.
> Thanks,
> Isaac
> Isaac J. Manjarres (5):
>   iommu/io-pgtable: Introduce unmap_pages() as a page table op
>   iommu: Add an unmap_pages() op for IOMMU drivers
>   iommu: Add support for the unmap_pages IOMMU callback
>   iommu/io-pgtable-arm: Implement arm_lpae_unmap_pages()
>   iommu/arm-smmu: Implement the unmap_pages IOMMU driver callback
>
>  drivers/iommu/arm/arm-smmu/arm-smmu.c |  19 +++++
>  drivers/iommu/io-pgtable-arm.c        | 114 +++++++++++++++++++++-----
>  drivers/iommu/iommu.c                 |  44 ++++++++--
>  include/linux/io-pgtable.h            |   4 +
>  include/linux/iommu.h                 |   4 +
>  5 files changed, 159 insertions(+), 26 deletions(-)
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu