On 05/09/2022 at 10:50, Nicholas Piggin wrote:
> The page table fragment allocator is a simple per-mm slab allocator.
> It can be quite wasteful of space for small processes, as well as being
> expensive to initialise. It does not do well at NUMA awareness.
>
> This is a quick hack at addressing some of those problems, but it's not
> complete. It doesn't support THP because it doesn't deal with the page
> table deposit. It has certain cases where cross-CPU locking could be
> increased (but also a reduction in other cases including reduction on
> ptl). NUMA still has some corner case issues, but it is improved. So
> it's not mergeable yet or necessarily the best way to solve the
> problems. Just a quick hack for some testing.
>
> It saves 1-2MB on a simple distro boot on a small (4 CPU) system. The
> powerpc fork selftests benchmark with --fork performance is improved by
> 15% on a POWER9 (14.5k/s -> 17k/s). This is just about a worst-case
> microbenchmark, but would still be good to fix it.
>
> What would really be nice is if we could avoid writing our own allocator
> and use the slab allocator. The problem being we need a page table lock
> spinlock associated with the page table, and that must be able to be
> derived from the page table pointer, and I don't think slab has anything
> that fits the bill.
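
To make the constraint in the last quoted paragraph concrete, here is a minimal userspace sketch, not kernel code, of deriving per-page metadata (and so a lock) from nothing but a fragment pointer. Every name in it (struct arena, struct page_meta, frag_to_meta(), the 4096-byte page and 256-byte fragment sizes) is made up for illustration, and the int lock field only stands in for a real spinlock.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define FRAG_PAGE_SIZE 4096u  /* assumed page size, illustration only */
#define FRAG_SIZE      256u   /* assumed fragment size, illustration only */
#define ARENA_PAGES    8u     /* hypothetical arena length in pages */

/* Per-page metadata, standing in for what a kernel would keep per page. */
struct page_meta {
    int lock;       /* placeholder for a spinlock */
    unsigned refs;  /* number of fragments handed out from this page */
};

struct arena {
    unsigned char *base;                /* page-aligned backing memory */
    struct page_meta meta[ARENA_PAGES]; /* one entry per backing page */
};

/* Allocate the page-aligned backing memory; returns 0 on success. */
static int arena_init(struct arena *a)
{
    a->base = aligned_alloc(FRAG_PAGE_SIZE,
                            (size_t)ARENA_PAGES * FRAG_PAGE_SIZE);
    if (!a->base)
        return -1;
    for (unsigned i = 0; i < ARENA_PAGES; i++)
        a->meta[i] = (struct page_meta){ 0, 0 };
    return 0;
}

/* Address of fragment number `frag` inside page number `page`. */
static void *arena_frag(struct arena *a, unsigned page, unsigned frag)
{
    assert(page < ARENA_PAGES && frag < FRAG_PAGE_SIZE / FRAG_SIZE);
    return a->base + (size_t)page * FRAG_PAGE_SIZE
                   + (size_t)frag * FRAG_SIZE;
}

/*
 * Derive the metadata from the fragment pointer alone: the offset into
 * the arena divided by the page size gives the index of the owning page.
 */
static struct page_meta *frag_to_meta(struct arena *a, const void *frag)
{
    uintptr_t off = (uintptr_t)frag - (uintptr_t)a->base;
    return &a->meta[off / FRAG_PAGE_SIZE];
}
```

All fragments carved from the same page resolve to the same page_meta entry, so they share one lock. That is the property a page-based fragment allocator gets for free, and which the quoted text says slab does not seem to offer.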

I have not looked at it in detail yet, but I have the feeling that the
handling of single-fragment architectures has disappeared.

That handling was added by commit 2a146533bf96 ("powerpc/mm: Avoid
useless lock with single page fragments").

Thanks to that optimisation, all platforms were converted to page
fragments with:
- commit 32ea4c149990 ("powerpc/mm: Extend pte_fragment functionality to
  PPC32")
- commit 737b434d3d55 ("powerpc/mm: convert Book3E 64 to pte_fragment")

But if the optimisation is removed then I guess the cost will likely be
higher than before.

Christophe