On Tue, Jun 24, 2025 at 12:17:18PM +0200, Neal Gompa wrote:
> On Sun, Jun 22, 2025 at 11:43 AM Florian Weimer <fwei...@redhat.com> wrote:
> >
> > We currently use -mno-omit-leaf-frame-pointer on various architectures.
> > I think we should remove it because it does not work as expected.
> >
> > Obviously, this does not work for many glibc string functions because
> > they use hand-written assembly that does not set up a frame pointer.
> >
> > The happy path of the malloc implementation in Fedora rawhide
> > implementation looks like this:
> >
> > Dump of assembler code for function __GI___libc_malloc:
> >    <+0>:        endbr64
> > # special case for zero size argument
> >    <+4>:        test   %rdi,%rdi
> >    <+7>:        js     0x82af8 <__GI___libc_malloc+264>
> > # compute true allocation size
> >    <+13>:       lea    0x17(%rdi),%rdx
> >    <+17>:       mov    %rdx,%rax
> >    <+20>:       and    $0xfffffffffffffff0,%rax
> >    <+24>:       cmp    $0x1f,%rdx
> >    <+28>:       mov    $0x20,%edx
> >    <+33>:       cmovbe %rdx,%rax
> > # check if size argument is in tcache range
> >    <+37>:       cmp    0x1647d4(%rip),%rax        # 0x1e71f0 <mp_+112>
> >    <+44>:       jae    0x82a80 <__GI___libc_malloc+144>
> > # load tcache pointer
> >    <+46>:       mov    0x16435b(%rip),%rdx        # 0x1e6d80
> >    <+53>:       mov    %fs:(%rdx),%rcx
> > # check for tcache initialization
> >    <+57>:       test   %rcx,%rcx
> >    <+60>:       je     0x82b10 <__GI___libc_malloc+288>
> > # compute tcache bin
> >    <+66>:       mov    %rax,%rdx
> >    <+69>:       shr    $0x4,%rdx
> >    <+73>:       lea    -0x2(%rdx),%rsi
> > # check for basic tcache range
> >    <+77>:       cmp    $0x3f,%rsi
> >    <+81>:       ja     0x82a88 <__GI___libc_malloc+152>
> >    <+83>:       add    $0x10,%rdx
> > # load tcache free list and check if it is empty
> >    <+87>:       mov    0x8(%rcx,%rdx,8),%rax
> >    <+92>:       test   %rax,%rax
> >    <+95>:       je     0x82a80 <__GI___libc_malloc+144>
> > # allocation alignment check
> >    <+97>:       test   $0xf,%al
> >    <+99>:       jne    0x82b40 <__GI___libc_malloc+336>
> > # pointer decoding
> >    <+105>:      mov    %rax,%rdi
> >    <+108>:      shr    $0xc,%rdi
> >    <+112>:      xor    (%rax),%rdi
> > # remove from tcache free list and record that there is more room in it
> >    <+115>:      mov    %rdi,0x8(%rcx,%rdx,8)
> >    <+120>:      addw   $0x1,(%rcx,%rsi,2)
> > # prevent leakage of double-free token in user allocation
> >    <+125>:      movq   $0x0,0x8(%rax)
> >    <+133>:      ret
> >
> > This is compiler-generated code.  The -mno-omit-leaf-frame-pointer flag
> > was used in the build flags as requested:
> >
> > gcc -m32 malloc.c -c -std=gnu11 -fgnu89-inline -O2 -g -Wall
> >   -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1
> >   -fasynchronous-unwind-tables -fstack-clash-protection -Wall
> >   -Wwrite-strings -Wundef -Wimplicit-fallthrough -Werror
> >   -fmerge-all-constants -frounding-math -fstack-protector-strong
> >   -fno-common -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -Wstrict-prototypes
> >   -Wold-style-definition -Wfree-labels -Wmissing-parameter-name
> >   -fmath-errno -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fPIC
> >   -DMORECORE_CLEARS=2 -ftls-model=initial-exec -I../include
> >   -I…/build-x86_64-redhat-linux-32/malloc
> >   -I…/build-x86_64-redhat-linux-32
> >   -I../sysdeps/unix/sysv/linux/i386/i686 -I../sysdeps/i386/i686/nptl
> >   -I../sysdeps/unix/sysv/linux/i386
> >   -I../sysdeps/unix/sysv/linux/x86/include
> >   -I../sysdeps/unix/sysv/linux/x86 -I../sysdeps/x86/nptl
> >   -I../sysdeps/i386/nptl -I../sysdeps/unix/sysv/linux/include
> >   -I../sysdeps/unix/sysv/linux -I../sysdeps/nptl -I../sysdeps/pthread
> >   -I../sysdeps/gnu -I../sysdeps/unix/inet -I../sysdeps/unix/sysv
> >   -I../sysdeps/unix/i386 -I../sysdeps/unix -I../sysdeps/posix
> >   -I../sysdeps/i386/i686/fpu/multiarch -I../sysdeps/i386/i686/fpu
> >   -I../sysdeps/i386/i686/multiarch -I../sysdeps/i386/i686
> >   -I../sysdeps/i386/fpu -I../sysdeps/x86/fpu -I../sysdeps/i386
> >   -I../sysdeps/x86/include -I../sysdeps/x86 -I../sysdeps/wordsize-32
> >   -I../sysdeps/ieee754/float128 -I../sysdeps/ieee754/ldbl-96/include
> >   -I../sysdeps/ieee754/ldbl-96 -I../sysdeps/ieee754/dbl-64
> >   -I../sysdeps/ieee754/flt-32 -I../sysdeps/ieee754 -I../sysdeps/generic
> >   -I.. -I../libio -I. -nostdinc
> >   -isystem /usr/lib/gcc/x86_64-redhat-linux/15/include -isystem /usr/include
> >   -D_LIBC_REENTRANT -include …/build-x86_64-redhat-linux-32/libc-modules.h
> >   -DMODULE_NAME=libc -include ../include/libc-symbols.h -DPIC -DSHARED
> >   -DUSE_TCACHE=1 -DTOP_NAMESPACE=glibc
> >   -o …/build-x86_64-redhat-linux-32/malloc/malloc.os
> >   -MD -MP -MF …/build-x86_64-redhat-linux-32/malloc/malloc.os.dt
> >   -MT …/build-x86_64-redhat-linux-32/malloc/malloc.os
> >
> > But clearly it has not the desired effect of setting up a frame pointer.
> > The reason is what GCC calls shrink-wrapping.  It's related to this old
> > optimization:
> >
> >   Minimizing register usage penalty at procedure calls
> >   <https://dl.acm.org/doi/abs/10.1145/53990.53999>
> >
> > Basically it pushes optional function prologue instructions (such as
> > saving callee-saved registers, or setting up the frame pointer) to the
> > code paths that actually need them.
> >
> > Given that most of the hot glibc functions do not use frame pointers (at
> > least not on their happy paths), PLT stubs do not have them, and there
> > is a race condition at the start of functions until the frame pointer is
> > set up, tools need to be aware that the top-most interrupted frame (or
> > the first frame after a signal frame) may use the caller's frame
> > pointer.  This does not prevent frame pointer based unwinding because in
> > the impacted frames, the frame pointer register is not changed.  Without
> > countermeasures, the immediate caller's frame just vanishes from
> > backtraces.
> >
> > Given how pervasive this effect is and that no problems have been
> > reported so far, I think we can drop -mno-omit-leaf-frame-pointer from
> > the build flags.
> >
> > Thoughts?
> >
> 
> So my semi-naive interpretation of the problem here is that glibc is
> special in that it doesn't provide any support for real-time tracing
> or profiling. I'm not sure that justifies removing this flag globally,
> as there are plenty of middleware stacks that do their own thing or
> even things that don't use glibc string functions at all.
> 
> So if you think it makes sense to omit the flag for glibc, sure, but I
> am not convinced it should be done system-wide.

My understanding of Florian's post [and maybe I'm wrong about that]
was that the compiler flag doesn't do what we think it does, and that
problem is more general and affects more than just glibc, anything C
code that is compiled by GCC.  glibc has a separate issue that hot
paths which are written in assembler don't include frame pointers, and
I think we all understand why that is and are fine with it.

Florian, could you clarify if that's right?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

-- 
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: 
https://pagure.io/fedora-infrastructure/new_issue

Reply via email to