Waiman Long reported 'pgd_lock' contention on high CPU count systems and proposed moving pgd_lock onto a separate cacheline to eliminate false sharing and to reduce some of the lock bouncing overhead.

I think we can do much better: this series eliminates the pgd_list and makes pgd_alloc()/pgd_free() lockless.

Now the lockless initialization of the PGD has a few preconditions, which the initial part of the series implements:

 - no PGD clearing is allowed, only additions. This makes sense as a single PGD entry covers 512 GB of RAM, so the 4K overhead per 0.5TB of RAM mapped is minuscule.

The patches after that convert existing pgd_list users to walk the task list.

PGD locking is kept intact: coherency guarantees between the CPA, vmalloc, hotplug, etc. code are unchanged.

The final patches eliminate the pgd_list and thus make pgd_alloc()/pgd_free() lockless.

The patches have been boot tested on 64-bit and 32-bit x86 systems.

Architectures not making use of the new facility are unaffected.

Thanks,

	Ingo

=====
Ingo Molnar (12):
  x86/mm/pat: Don't free PGD entries on memory unmap
  x86/mm/hotplug: Remove pgd_list use from the memory hotplug code
  x86/mm/hotplug: Don't remove PGD entries in remove_pagetable()
  x86/mm/hotplug: Simplify sync_global_pgds()
  mm: Introduce arch_pgd_init_late()
  x86/mm: Enable and use the arch_pgd_init_late() method
  x86/virt/guest/xen: Remove use of pgd_list from the Xen guest code
  x86/mm: Remove pgd_list use from vmalloc_sync_all()
  x86/mm/pat/32: Remove pgd_list use from the PAT code
  x86/mm: Make pgd_alloc()/pgd_free() lockless
  x86/mm: Remove pgd_list leftovers
  x86/mm: Simplify pgd_alloc()

 arch/Kconfig                      |   9 ++++
 arch/x86/Kconfig                  |   1 +
 arch/x86/include/asm/pgtable.h    |   3 --
 arch/x86/include/asm/pgtable_64.h |   3 +-
 arch/x86/mm/fault.c               |  24 +++++++----
 arch/x86/mm/init_64.c             |  73 +++++++++++---------------------
 arch/x86/mm/pageattr.c            |  34 ++++++++--------
 arch/x86/mm/pgtable.c             | 129 +++++++++++++++++++++++++++++----------------------------
 arch/x86/xen/mmu.c                |  34 +++++++++++++---
 fs/exec.c                         |   3 ++
 include/linux/mm.h                |   6 +++
 kernel/fork.c                     |  16 ++++++++
 12 files changed, 183 insertions(+), 152 deletions(-)

--
2.1.4