This series intends to improve performance and reduce stack consumption in the slice allocation code. It does so by keeping slice masks in the mm_context rather than recomputing them for each allocation, and by removing bitmaps and slice_masks from the stack, using pointers to the cached masks instead where possible.
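
To illustrate the approach, here is a minimal userspace sketch of the
caching idea. The names here (the mm_context fields, slice_recalc_masks,
slice_mask_for) are hypothetical and the high-slice bitmap is simplified
to a single word; this is not the kernel code, just the shape of it: keep
one precomputed mask per page size in the context, invalidate it when the
mapping changes, and hand out const pointers instead of building masks on
the stack.

#include <stdbool.h>
#include <stdint.h>

struct slice_mask {
	uint16_t low_slices;
	uint64_t high_slices;	/* simplified: one word, not a full bitmap */
};

struct mm_context {
	struct slice_mask mask_4k;	/* one cached mask per page-size index */
	struct slice_mask mask_64k;
	bool masks_valid;		/* cleared when the psize map changes */
};

/* Recompute the cached masks; called only when the psize map changes. */
static void slice_recalc_masks(struct mm_context *ctx)
{
	/* scan the per-slice psize arrays and fill mask_4k / mask_64k here */
	ctx->masks_valid = true;
}

/*
 * Allocation path: hand back a pointer to the cached mask instead of
 * computing one into an on-stack copy on every call.
 */
static const struct slice_mask *slice_mask_for(struct mm_context *ctx,
					       bool want_64k)
{
	if (!ctx->masks_valid)
		slice_recalc_masks(ctx);
	return want_64k ? &ctx->mask_64k : &ctx->mask_4k;
}

Because the masks are rebuilt only when the mapping changes, their cost is
amortized across allocations, which is where both the stack savings and
the runtime improvement below come from.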
checkstack.pl gives, before:
0x00000de4 slice_get_unmapped_area [slice.o]:		656
0x00001b4c is_hugepage_only_range [slice.o]:		512
0x0000075c slice_find_area_topdown [slice.o]:		416
0x000004c8 slice_find_area_bottomup.isra.1 [slice.o]:	272
0x00001aa0 slice_set_range_psize [slice.o]:		240
0x00000a64 slice_find_area [slice.o]:			176
0x00000174 slice_check_fit [slice.o]:			112

after:
0x00000d70 slice_get_unmapped_area [slice.o]:		320
0x000008f8 slice_find_area [slice.o]:			144
0x00001860 slice_set_range_psize [slice.o]:		144
0x000018ec is_hugepage_only_range [slice.o]:		144
0x00000750 slice_find_area_bottomup.isra.4 [slice.o]:	128

The benchmark in https://github.com/linuxppc/linux/issues/49 gives, before:

$ time ./slicemask
real	0m20.712s
user	0m5.830s
sys	0m15.105s

after:

$ time ./slicemask
real	0m13.197s
user	0m5.409s
sys	0m7.779s

Thanks,
Nick

Nicholas Piggin (5):
  powerpc/mm/slice: pass pointers to struct slice_mask where possible
  powerpc/mm/slice: implement a slice mask cache
  powerpc/mm/slice: implement slice_check_range_fits
  powerpc/mm/slice: Use const pointers to cached slice masks where
    possible
  powerpc/mm/slice: use the dynamic high slice size to limit bitmap
    operations

 arch/powerpc/include/asm/book3s/64/mmu.h |  20 +-
 arch/powerpc/mm/slice.c                  | 302 +++++++++++++++++++------------
 2 files changed, 204 insertions(+), 118 deletions(-)

-- 
2.15.1