On Tue, 2017-11-07 at 07:53:07 UTC, Nicholas Piggin wrote:
> Currently for radix, flush_tlb_range flushes the entire PID, because
> the Linux mm code does not tell us about page size here for THP vs
> regular pages. This is quite sub-optimal for small mremap / mprotect
> / change_protection.
>
> So implement va range flushes with two flush passes, one for each
> page size (regular and THP). The second flush has an order of magnitude
> fewer tlbie instructions than the first, so it is a relatively small
> additional cost.
>
> There is still room for improvement here with some changes to generic
> APIs, particularly if there are mostly THP pages to be invalidated,
> the small page flushes could be reduced.
>
> Time to mprotect 1 page of memory (after mmap, touch):
> vanilla 2.9us 1.8us
> patched 1.2us 1.6us
>
> Time to mprotect 30 pages of memory (after mmap, touch):
> vanilla 8.2us 7.2us
> patched 6.9us 17.9us
>
> Time to mprotect 34 pages of memory (after mmap, touch):
> vanilla 9.1us 8.0us
> patched 9.0us 8.0us
>
> 34 pages is the point at which the invalidation switches from va
> to entire PID, which tlbie can do in a single instruction. This is
> why in the case of 30 pages, the new code runs slower for this test.
> This is a deliberate tradeoff already present in the unmap and THP
> promotion code; the idea is that there is a benefit from avoiding
> flushing the entire TLB for this PID on all threads in the system.
>
> Signed-off-by: Nicholas Piggin <npig...@gmail.com>
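
For readers following along, the tradeoff described in the changelog above can be sketched roughly as below. This is a user-space illustration only, not the kernel code from the patch. The page sizes (64K base pages, 2M THP), the ceiling of 33 pages (inferred from the "34 pages" switch point mentioned above), and the function names are all assumptions made for the example.

```c
/* Illustrative model only: all constants and names here are hypothetical. */
#define SMALL_PAGE_SHIFT 16UL /* assumed 64K base page size */
#define THP_PAGE_SHIFT   21UL /* assumed 2M transparent huge page size */
#define FLUSH_CEILING    33UL /* assumed: above this many base pages, flush the whole PID */

/* Count the per-page invalidations needed to cover [start, end) at one page size. */
static unsigned long count_va_flushes(unsigned long start, unsigned long end,
                                      unsigned long shift)
{
    unsigned long page_size = 1UL << shift;
    unsigned long first = start & ~(page_size - 1);                /* round down */
    unsigned long last = (end + page_size - 1) & ~(page_size - 1); /* round up */

    return (last - first) >> shift;
}

/*
 * Model the number of invalidations issued for a range flush:
 * small ranges get two passes (base page size, then THP size),
 * large ranges fall back to one PID-wide invalidation.
 */
static unsigned long range_flush_cost(unsigned long start, unsigned long end)
{
    unsigned long nr_pages = count_va_flushes(start, end, SMALL_PAGE_SHIFT);

    if (nr_pages > FLUSH_CEILING)
        return 1; /* single invalidation covering the entire PID */

    /* Pass 1: base pages. Pass 2: THP-sized pages over the same range. */
    return nr_pages + count_va_flushes(start, end, THP_PAGE_SHIFT);
}
```

Under these assumed values, a 30-page range starting at address 0 costs 31 per-page invalidations in this model, while a 34-page range collapses to a single PID-wide one. That is the same shape as the 30-page versus 34-page behaviour reported in the timings above.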
Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/cbf09c837720f72f5e63ab7a2d331e

cheers