On Wed, Mar 15, 2017 at 7:03 PM, Robin Murphy <robin.mur...@arm.com> wrote:
> Hi all,
>
> Here's the first bit of lock contention removal to chew on - feedback
> welcome! Note that for the current users of the io-pgtable framework,
> this is most likely to simply push more contention onto the io-pgtable
> lock, so may not show a great improvement alone. Will and I both have
> rough proof-of-concept implementations of lock-free io-pgtable code
> which we need to sit down and agree on at some point, hopefully fairly
> soon.

Thanks for working on this. As you said, it does push the lock contention down from the iova rbtree lock to the pgtable lock, but the bigger issue I now see, more than the lock itself, is the CPU yielding while waiting for tlb_sync. Below are some numbers.

I tweaked '__arm_smmu_tlb_sync' in the SMMUv2 driver, i.e. removed cpu_relax() and udelay() to make it a busy loop.

Before: 1.1 Gbps
With your patches: 1.45 Gbps
With your patches + busy loop in tlb_sync: 7 Gbps

If we reduce pgtable contention a bit:
With your patches + busy loop in tlb_sync + iperf threads reduced from 16 to 8: ~9 Gbps

So it looks like, along with the pgtable lock, some optimization can be done in the tlb_sync code as well.

Thanks,
Sunil.
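
To make the tweak concrete, here is a rough user-space sketch of the two polling shapes being compared. It is not the driver code: fake_sync_reg, sync_active(), fake_udelay(), SYNC_TIMEOUT_POLLS and both wait_sync_*() functions are hypothetical stand-ins, and the "register" is just a counter that reports busy for a fixed number of reads. The only point is that the delayed variant pays a fixed wait on every poll that finds the sync still active, while the busy variant re-reads the status immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a hardware status register: it reports
 * "sync still active" for the first busy_polls reads, then "done". */
struct fake_sync_reg {
	unsigned int busy_polls;
};

static bool sync_active(struct fake_sync_reg *reg)
{
	assert(reg != NULL);
	if (reg->busy_polls == 0)
		return false;
	reg->busy_polls--;
	return true;
}

/* Stand-in for a per-iteration delay; it only accumulates the time we
 * would have waited, so the cost of the delayed loop is observable. */
static unsigned int delayed_us;

static void fake_udelay(unsigned int us)
{
	delayed_us += us;
}

/* Assumed poll limit for this sketch, not a value taken from the driver. */
#define SYNC_TIMEOUT_POLLS 1000000u

/* Poll with a 1us delay after every read that finds the sync active.
 * Returns 0 on completion, -1 if the poll limit is reached. */
static int wait_sync_with_delay(struct fake_sync_reg *reg)
{
	unsigned int count = 0;

	while (sync_active(reg)) {
		if (++count == SYNC_TIMEOUT_POLLS)
			return -1;
		fake_udelay(1);
	}
	return 0;
}

/* Same loop with the delay removed: a pure busy loop on the status.
 * Returns 0 on completion, -1 if the poll limit is reached. */
static int wait_sync_busy(struct fake_sync_reg *reg)
{
	unsigned int count = 0;

	while (sync_active(reg)) {
		if (++count == SYNC_TIMEOUT_POLLS)
			return -1;
	}
	return 0;
}
```

In this toy model a sync that needs five status reads to complete costs five delay units in the first variant and none in the second. Whether that maps onto the throughput numbers above depends on how quickly the real hardware completes a sync, which this sketch does not model.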