On Tue Sep 6, 2022 at 4:36 AM AEST, Christophe Leroy wrote:
>
>
> On 05/09/2022 at 10:50, Nicholas Piggin wrote:
> > The page table fragment allocator is a simple per-mm slab allocator.
> > It can be quite wasteful of space for small processes, as well as being
> > expensive to initialise. It does not do well at NUMA awareness.
> >
> > This is a quick hack at addressing some of those problems, but it's not
> > complete. It doesn't support THP because it doesn't deal with the page
> > table deposit. It has certain cases where cross-CPU locking could be
> > increased (but also a reduction in other cases including reduction on
> > ptl). NUMA still has some corner case issues, but it is improved. So
> > it's not mergeable yet or necessarily the best way to solve the
> > problems. Just a quick hack for some testing.
> >
> > It saves 1-2MB on a simple distro boot on a small (4 CPU) system. The
> > powerpc fork selftests benchmark with --fork performance is improved by
> > 15% on a POWER9 (14.5k/s -> 17k/s). This is just about a worst-case
> > microbenchmark, but would still be good to fix it.
> >
> > What would really be nice is if we could avoid writing our own allocator
> > and use the slab allocator. The problem being we need a page table lock
> > spinlock associated with the page table, and that must be able to be
> > derived from the page table pointer, and I don't think slab has anything
> > that fits the bill.
>
> I have not looked at it in detail yet, but I have the feeling that the
> handling of single-fragment architectures has disappeared.

Yes that's gone from my hack, it should be special-cased of course to
reduce or avoid unnecessary overhead.

Thanks,
Nick

>
> That's commit 2a146533bf96 ("powerpc/mm: Avoid useless lock with single
> page fragments").
>
> Thanks to that optimisation, all platforms were converted to page
> fragments with:
> - commit 32ea4c149990 ("powerpc/mm: Extend pte_fragment functionality to
> PPC32")
> - commit 737b434d3d55 ("powerpc/mm: convert Book3E 64 to pte_fragment")
>
>
> But if the optimisation is removed then I guess the cost will likely be
> higher than before.
>
> Christophe