Hi Andrew,

On 02/05/2025 11:01, Ryan Roberts wrote:
> On 02/05/2025 01:18, Kees Cook wrote:
>> In commit bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing
>> direct loader exec"), the brk was moved out of the mmap region when
>> loading static PIE binaries (ET_DYN without INTERP). The common case
>> for these binaries was testing new ELF loaders, so the brk needed to
>> be away from mmap to avoid colliding with stack, future mmaps (of the
>> loader-loaded binary), etc. But this was only done when ASLR was enabled,
>> in an attempt to minimize changes to memory layouts.
>>
>> After adding support to respect alignment requirements for static PIE
>> binaries in commit 3545deff0ec7 ("binfmt_elf: Honor PT_LOAD alignment
>> for static PIE"), it became possible to have a large gap after the
>> final PT_LOAD segment and the top of the mmap region. This means that
>> future mmap allocations might go after the last PT_LOAD segment (where
>> brk might be if ASLR was disabled) instead of before them (where they
>> traditionally ended up).
>>
>> On arm64, running with ASLR disabled, Ubuntu 22.04's "ldconfig" binary,
>> a static PIE, has alignment requirements that leaves a gap large enough
>> after the last PT_LOAD segment to fit the vdso and vvar, but still leave
>> enough space for the brk (which immediately follows the last PT_LOAD
>> segment) to be allocated by the binary.
>>
>> fffff7f20000-fffff7fde000 r-xp 00000000 fe:02 8110426 /sbin/ldconfig.real
>> fffff7fee000-fffff7ff5000 rw-p 000be000 fe:02 8110426 /sbin/ldconfig.real
>> fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0
>> ***[brk will go here at fffff7ffa000]***
>> fffff7ffc000-fffff7ffe000 r--p 00000000 00:00 0       [vvar]
>> fffff7ffe000-fffff8000000 r-xp 00000000 00:00 0       [vdso]
>> fffffffdf000-1000000000000 rw-p 00000000 00:00 0      [stack]
>>
>> After commit 0b3bc3354eb9 ("arm64: vdso: Switch to generic storage
>> implementation"), the arm64 vvar grew slightly, and suddenly the brk
>> collided with the allocation.
>>
>> fffff7f20000-fffff7fde000 r-xp 00000000 fe:02 8110426 /sbin/ldconfig.real
>> fffff7fee000-fffff7ff5000 rw-p 000be000 fe:02 8110426 /sbin/ldconfig.real
>> fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0
>> ***[oops, no room any more, vvar is at fffff7ffa000!]***
>> fffff7ffa000-fffff7ffe000 r--p 00000000 00:00 0       [vvar]
>> fffff7ffe000-fffff8000000 r-xp 00000000 00:00 0       [vdso]
>> fffffffdf000-1000000000000 rw-p 00000000 00:00 0      [stack]

This change is fixing a pretty serious bug that appeared in v6.15-rc1 so I was
hoping it would make it into the v6.15 final release. I'm guessing mm is the
correct route in? But I don't currently see this in linus's tree or in any of
your mm- staging branches. Is there still time to get this in?

Thanks,
Ryan


>>
>> The solution is to unconditionally move the brk out of the mmap region
>> for static PIE binaries. Whether ASLR is enabled or not does not change if
>> there may be future mmap allocation collisions with a growing brk region.
>>
>> Update memory layout comments (with kernel-doc headings), consolidate
>> the setting of mm->brk to later (it isn't needed early), move static PIE
>> brk out of mmap unconditionally, and make sure brk(2) knows to base brk
>> position off of mm->start_brk not mm->end_data no matter what the cause of
>> moving it is (via current->brk_randomized).
>>
>> For the CONFIG_COMPAT_BRK case, though, leave the logic unchanged, as we
>> can never safely move the brk. These systems, however, are not using
>> specially aligned static PIE binaries.
>>
>> Reported-by: Ryan Roberts <[email protected]>
>> Closes: 
>> https://lore.kernel.org/lkml/[email protected]/
>> Fixes: bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing direct 
>> loader exec")
>> Link: https://lore.kernel.org/r/[email protected]
>> Signed-off-by: Kees Cook <[email protected]>
> 
> It's a shame we can't figure out a universal solution that would work for
> CONFIG_COMPAT_BRK too, but I can't think of anything. So as you say, let's 
> worry
> about that in the unlikely event that an issue is reported.
> 
> Reviewed-by: Ryan Roberts <[email protected]>
> Tested-by: Ryan Roberts <[email protected]>
> 
> Thanks for sorting this out! Would be great to get it into v6.15.
> 
>> ---
>>  v2: exclude CONFIG_COMPAT_BRK (ryan)
>>  v1: https://lore.kernel.org/all/[email protected]/
>> ---
>>  fs/binfmt_elf.c | 71 ++++++++++++++++++++++++++++++++-----------------
>>  1 file changed, 47 insertions(+), 24 deletions(-)
>>
>> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
>> index 584fa89bc877..4c1ea6b52a53 100644
>> --- a/fs/binfmt_elf.c
>> +++ b/fs/binfmt_elf.c
>> @@ -830,6 +830,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
>>      struct elf_phdr *elf_ppnt, *elf_phdata, *interp_elf_phdata = NULL;
>>      struct elf_phdr *elf_property_phdata = NULL;
>>      unsigned long elf_brk;
>> +    bool brk_moved = false;
>>      int retval, i;
>>      unsigned long elf_entry;
>>      unsigned long e_entry;
>> @@ -1097,15 +1098,19 @@ static int load_elf_binary(struct linux_binprm *bprm)
>>                      /* Calculate any requested alignment. */
>>                      alignment = maximum_alignment(elf_phdata, 
>> elf_ex->e_phnum);
>>  
>> -                    /*
>> -                     * There are effectively two types of ET_DYN
>> -                     * binaries: programs (i.e. PIE: ET_DYN with PT_INTERP)
>> -                     * and loaders (ET_DYN without PT_INTERP, since they
>> -                     * _are_ the ELF interpreter). The loaders must
>> -                     * be loaded away from programs since the program
>> -                     * may otherwise collide with the loader (especially
>> -                     * for ET_EXEC which does not have a randomized
>> -                     * position). For example to handle invocations of
>> +                    /**
>> +                     * DOC: PIE handling
>> +                     *
>> +                     * There are effectively two types of ET_DYN ELF
>> +                     * binaries: programs (i.e. PIE: ET_DYN with
>> +                     * PT_INTERP) and loaders (i.e. static PIE: ET_DYN
>> +                     * without PT_INTERP, usually the ELF interpreter
>> +                     * itself). Loaders must be loaded away from programs
>> +                     * since the program may otherwise collide with the
>> +                     * loader (especially for ET_EXEC which does not have
>> +                     * a randomized position).
>> +                     *
>> +                     * For example, to handle invocations of
>>                       * "./ld.so someprog" to test out a new version of
>>                       * the loader, the subsequent program that the
>>                       * loader loads must avoid the loader itself, so
>> @@ -1118,6 +1123,9 @@ static int load_elf_binary(struct linux_binprm *bprm)
>>                       * ELF_ET_DYN_BASE and loaders are loaded into the
>>                       * independently randomized mmap region (0 load_bias
>>                       * without MAP_FIXED nor MAP_FIXED_NOREPLACE).
>> +                     *
>> +                     * See below for "brk" handling details, which is
>> +                     * also affected by program vs loader and ASLR.
>>                       */
>>                      if (interpreter) {
>>                              /* On ET_DYN with PT_INTERP, we do the ASLR. */
>> @@ -1234,8 +1242,6 @@ static int load_elf_binary(struct linux_binprm *bprm)
>>      start_data += load_bias;
>>      end_data += load_bias;
>>  
>> -    current->mm->start_brk = current->mm->brk = ELF_PAGEALIGN(elf_brk);
>> -
>>      if (interpreter) {
>>              elf_entry = load_elf_interp(interp_elf_ex,
>>                                          interpreter,
>> @@ -1291,27 +1297,44 @@ static int load_elf_binary(struct linux_binprm *bprm)
>>      mm->end_data = end_data;
>>      mm->start_stack = bprm->p;
>>  
>> -    if ((current->flags & PF_RANDOMIZE) && (snapshot_randomize_va_space > 
>> 1)) {
>> +    /**
>> +     * DOC: "brk" handling
>> +     *
>> +     * For architectures with ELF randomization, when executing a
>> +     * loader directly (i.e. static PIE: ET_DYN without PT_INTERP),
>> +     * move the brk area out of the mmap region and into the unused
>> +     * ELF_ET_DYN_BASE region. Since "brk" grows up it may collide
>> +     * early with the stack growing down or other regions being put
>> +     * into the mmap region by the kernel (e.g. vdso).
>> +     *
>> +     * In the CONFIG_COMPAT_BRK case, though, everything is turned
>> +     * off because we're not allowed to move the brk at all.
>> +     */
>> +    if (!IS_ENABLED(CONFIG_COMPAT_BRK) &&
>> +        IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) &&
>> +        elf_ex->e_type == ET_DYN && !interpreter) {
>> +            elf_brk = ELF_ET_DYN_BASE;
>> +            /* This counts as moving the brk, so let brk(2) know. */
>> +            brk_moved = true;
>> +    }
>> +    mm->start_brk = mm->brk = ELF_PAGEALIGN(elf_brk);
>> +
>> +    if ((current->flags & PF_RANDOMIZE) && snapshot_randomize_va_space > 1) 
>> {
>>              /*
>> -             * For architectures with ELF randomization, when executing
>> -             * a loader directly (i.e. no interpreter listed in ELF
>> -             * headers), move the brk area out of the mmap region
>> -             * (since it grows up, and may collide early with the stack
>> -             * growing down), and into the unused ELF_ET_DYN_BASE region.
>> +             * If we didn't move the brk to ELF_ET_DYN_BASE (above),
>> +             * leave a gap between .bss and brk.
>>               */
>> -            if (IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) &&
>> -                elf_ex->e_type == ET_DYN && !interpreter) {
>> -                    mm->brk = mm->start_brk = ELF_ET_DYN_BASE;
>> -            } else {
>> -                    /* Otherwise leave a gap between .bss and brk. */
>> +            if (!brk_moved)
>>                      mm->brk = mm->start_brk = mm->brk + PAGE_SIZE;
>> -            }
>>  
>>              mm->brk = mm->start_brk = arch_randomize_brk(mm);
>> +            brk_moved = true;
>> +    }
>> +
>>  #ifdef compat_brk_randomized
>> +    if (brk_moved)
>>              current->brk_randomized = 1;
>>  #endif
>> -    }
>>  
>>      if (current->personality & MMAP_PAGE_ZERO) {
>>              /* Why this, you ask???  Well SVr4 maps page 0 as read-only,
> 


Reply via email to