On Wed, Aug 28, 2024 at 07:19:36PM +0100, Lorenzo Stoakes wrote: > On Tue, Aug 27, 2024 at 10:49:06PM GMT, Charlie Jenkins wrote: > > Some applications rely on placing data in free bits addresses allocated > > by mmap. Various architectures (eg. x86, arm64, powerpc) restrict the > > address returned by mmap to be less than the maximum address space, > > unless the hint address is greater than this value. > > > > On arm64 this barrier is at 52 bits and on x86 it is at 56 bits. This > > flag allows applications a way to specify exactly how many bits they > > want to be left unused by mmap. This eliminates the need for > > applications to know the page table hierarchy of the system to be able > > to reason which addresses mmap will be allowed to return. > > > > --- > > riscv made this feature of mmap returning addresses less than the hint > > address the default behavior. This was in contrast to the implementation > > of x86/arm64 that have a single boundary at the 5-level page table > > region. However this restriction proved too great -- the reduced > > address space when using a hint address was too small. > > > > A patch for riscv [1] reverts the behavior that broke userspace. This > > series serves to make this feature available to all architectures. > > I'm a little confused as to the justification for this - you broke RISC V by > doing this, and have now reverted it, but now offer the same behaviour that > broke RISC V to all other architectures? > > I mean this is how this reads, so I might be being ungenerous here :) but > would > be good to clarify what the value-add is here.
Yeah I did not do a good job of explaining this! Having this be the default behavior was broken, not that this feature in general was broken. > > I also wonder at use of a new MAP_ flag, they're a limited resource and we > should only ever add them if we _really_ need to. This seems a bit niche and > specific to be making such a big change for including touching a bunch of > pretty > sensitive arch-specific code. > > We have the ability to change how mmap() functions through 'personalities' > though of course this would impact every mmap() call in the process. > > Overall I'm really not hugely convinced by this, it feels like userland > could find better ways of doing this (mostly you'd do a PROT_NONE mmap() to > reserve a domain and mprotect() it on allocation or mmap() over it). > > So I just struggle to see the purpose myself. BUT absolutely I may be > missing context/others may have a view on the value of this. So happy to > stand corrected. > > > > > I have only tested on riscv and x86. There is a tremendous amount of > > Yeah, OK this is crazy, you can't really submit something as non-RFC that > touches every single arch and not test it. > > I also feel like we need more justification than 'this is a neat thing that > we use in RISC V sometimes' conceptually for such a big change. I will send out a new version that does a much better job at explaining! This is not something that is done on riscv ever currently. This is something that is done on other architectures such as x86 and arm64. This flag is to make similar behavior (the ability to force mmap to return addresses that have a constrained address space) available to all architectures. > > Also your test program is currently completely broken afaict (have > commented on it directly). I also feel like your test program is a little > rudimentary, and should test some edge cases close to the limit etc. > > So I think this is a NACK until there is testing across the board and a little > more justification. > > Feel free to respin, but I think any future revisions should be RFC until > we're absolutely sure on testing/justification. > > I appreciate your efforts here so sorry to be negative, but just obviously > want to make sure this is functional and trades off added complexity for > value for the kernel and userland :) > Totally understand thank you! After reviewing comments I have realized that I made this much more complicated than it needs to be. This should be able to be done without changing any architecture specific code. That will mostly eliminate all of the complexity, but still has the downside of consuming a MAP_ flag. - Charlie > Thanks! > > > duplicated code in mmap so the implementations across architectures I > > believe should be mostly consistent. I added this feature to all > > architectures that implement either > > arch_get_mmap_end()/arch_get_mmap_base() or > > arch_get_unmapped_area_topdown()/arch_get_unmapped_area(). I also added > > it to the default behavior for arch_get_mmap_end()/arch_get_mmap_base(). > > > > Link: > > https://lore.kernel.org/lkml/[email protected]/T/ > > [1] > > > > To: Arnd Bergmann <[email protected]> > > To: Paul Walmsley <[email protected]> > > To: Palmer Dabbelt <[email protected]> > > To: Albert Ou <[email protected]> > > To: Catalin Marinas <[email protected]> > > To: Will Deacon <[email protected]> > > To: Michael Ellerman <[email protected]> > > To: Nicholas Piggin <[email protected]> > > To: Christophe Leroy <[email protected]> > > To: Naveen N Rao <[email protected]> > > To: Muchun Song <[email protected]> > > To: Andrew Morton <[email protected]> > > To: Liam R. Howlett <[email protected]> > > To: Vlastimil Babka <[email protected]> > > To: Lorenzo Stoakes <[email protected]> > > To: Thomas Gleixner <[email protected]> > > To: Ingo Molnar <[email protected]> > > To: Borislav Petkov <[email protected]> > > To: Dave Hansen <[email protected]> > > To: [email protected] > > To: H. Peter Anvin <[email protected]> > > To: Huacai Chen <[email protected]> > > To: WANG Xuerui <[email protected]> > > To: Russell King <[email protected]> > > To: Thomas Bogendoerfer <[email protected]> > > To: James E.J. Bottomley <[email protected]> > > To: Helge Deller <[email protected]> > > To: Alexander Gordeev <[email protected]> > > To: Gerald Schaefer <[email protected]> > > To: Heiko Carstens <[email protected]> > > To: Vasily Gorbik <[email protected]> > > To: Christian Borntraeger <[email protected]> > > To: Sven Schnelle <[email protected]> > > To: Yoshinori Sato <[email protected]> > > To: Rich Felker <[email protected]> > > To: John Paul Adrian Glaubitz <[email protected]> > > To: David S. Miller <[email protected]> > > To: Andreas Larsson <[email protected]> > > To: Shuah Khan <[email protected]> > > To: Alexandre Ghiti <[email protected]> > > Cc: [email protected] > > Cc: [email protected] > > Cc: Palmer Dabbelt <[email protected]> > > Cc: [email protected] > > Cc: [email protected] > > Cc: [email protected] > > Cc: [email protected] > > Cc: [email protected] > > Cc: [email protected] > > Cc: [email protected] > > Cc: [email protected] > > Cc: [email protected] > > Cc: [email protected] > > Cc: [email protected] > > Signed-off-by: Charlie Jenkins <[email protected]> > > > > --- > > Charlie Jenkins (16): > > mm: Add MAP_BELOW_HINT > > riscv: mm: Do not restrict mmap address based on hint > > mm: Add flag and len param to arch_get_mmap_base() > > mm: Add generic MAP_BELOW_HINT > > riscv: mm: Support MAP_BELOW_HINT > > arm64: mm: Support MAP_BELOW_HINT > > powerpc: mm: Support MAP_BELOW_HINT > > x86: mm: Support MAP_BELOW_HINT > > loongarch: mm: Support MAP_BELOW_HINT > > arm: mm: Support MAP_BELOW_HINT > > mips: mm: Support MAP_BELOW_HINT > > parisc: mm: Support MAP_BELOW_HINT > > s390: mm: Support MAP_BELOW_HINT > > sh: mm: Support MAP_BELOW_HINT > > sparc: mm: Support MAP_BELOW_HINT > > selftests/mm: Create MAP_BELOW_HINT test > > > > arch/arm/mm/mmap.c | 10 ++++++++ > > arch/arm64/include/asm/processor.h | 34 > > ++++++++++++++++++++++---- > > arch/loongarch/mm/mmap.c | 11 +++++++++ > > arch/mips/mm/mmap.c | 9 +++++++ > > arch/parisc/include/uapi/asm/mman.h | 1 + > > arch/parisc/kernel/sys_parisc.c | 9 +++++++ > > arch/powerpc/include/asm/task_size_64.h | 36 > > +++++++++++++++++++++++----- > > arch/riscv/include/asm/processor.h | 32 ------------------------- > > arch/s390/mm/mmap.c | 10 ++++++++ > > arch/sh/mm/mmap.c | 10 ++++++++ > > arch/sparc/kernel/sys_sparc_64.c | 8 +++++++ > > arch/x86/kernel/sys_x86_64.c | 25 ++++++++++++++++--- > > fs/hugetlbfs/inode.c | 2 +- > > include/linux/sched/mm.h | 34 > > ++++++++++++++++++++++++-- > > include/uapi/asm-generic/mman-common.h | 1 + > > mm/mmap.c | 2 +- > > tools/arch/parisc/include/uapi/asm/mman.h | 1 + > > tools/include/uapi/asm-generic/mman-common.h | 1 + > > tools/testing/selftests/mm/Makefile | 1 + > > tools/testing/selftests/mm/map_below_hint.c | 29 ++++++++++++++++++++++ > > 20 files changed, 216 insertions(+), 50 deletions(-) > > --- > > base-commit: 5be63fc19fcaa4c236b307420483578a56986a37 > > change-id: 20240827-patches-below_hint_mmap-b13d79ae1c55 > > -- > > - Charlie > > > > > > _______________________________________________ > > linux-riscv mailing list > > [email protected] > > http://lists.infradead.org/mailman/listinfo/linux-riscv
