On Thu, Feb 29, 2024 at 04:26:00AM -0800, H.J. Lu wrote:
> > > Adding Hongtao and Honza into the loop as the ones who acked the original
> > > patch.
> > >
> > > The no_callee_saved_registers by default for noreturn functions change can
> > > break in-process backtrace(3) or backtraces from debugger or other process
> > > (quite often, any time the noreturn function decides to use the bp 
> > > register
> > > and any of the parent frames uses a frame pointer; the unwinder just 
> > > crashes
> > > in the libgcc unwinder case, gdb prints stack corrupted message), so I'd
> > > like to save bp register in that case:
> > >
> > > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646591.html
> > I think this patch makes sense and LGTM, we save and restore frame
> > pointer for noreturn.
> > >
> > > and additionally the no_callee_saved_registers by default for noreturn
> > > functions change can make debugging harder, again not localized to the
> > > noreturn function, but any of its callers.  So, if say glibc abort 
> > > function
> > > implementation needs a lot of normally callee-saved registers, no matter 
> > > how
> > > users recompile their apps, they will see garbage or optimized out
> > > vars/parameters in their code unless they rebuild their glibc with -O0.
> > > So, I think we should guard that by a non-default option:
> > >
> > > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646649.html
> > So it turns off the optimization for noreturn functions by default,
> > I'm not sure about this.
> > Any comments, H.J?
> 
> We need BP for backtrace.  I don't think we need to save other
> registers.  True, GDB may not see function parameters.  But
> optimization always has this impact.  When I need to debug a
> program, I always use -O0 or -Og.

The problem is that it doesn't help in this case.
If some optimization makes debugging of some function harder, normally it is
enough to recompile the translation unit that defines it with -O0/-Og, or
add optimize attribute on the function.
While in this case, the optimization interferes with debugging of other
functions, not necessarily from the same translation unit, not necessarily
even from the same library or binary, or even from the same package.
As I tried to explain, supposedly glibc abort is compiled with -O2 and needs
a lot of registers, so say it uses all of %rbx, %rbp, %r12, %r13, %r14,
%r15 and this optimization is applied on it.  That means debugging of any
application (-O2, -Og or even -O0 compiled) to understand what went wrong
and why it aborted will be harder.  Including core file analysis.
Recompiling those apps with -O0/-Og will not help.  The only thing that
would help is to recompile glibc with -O0/-Og.
Doesn't have to be abort, doesn't have to be glibc.  Any library which
exports some noreturn APIs may be affected.
And there is not even a workaround other than to recompile with -O0/-Og the
noreturn functions, no way to disable this optimization.

Given that most users just will not be aware of this, even adding the option
but defaulting to on would mean a problem for a lot of users.  Most of them
will not know the problem is that some noreturn function 10 frames deep in
the call stack was optimized this way.

If people only call the noreturn functions from within the same package,
for some strange reason care about performance of noreturn functions (they
don't return, so unless you longjmp out of them or something similar
which is costly on its own already, they should be entered exactly once)
and are willing to pay the price in worse debugging in that case, let them
use the option.  But if they provide libraries that other packages then
consume, I'd say it wouldn't be a good idea.

        Jakub

Reply via email to