https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119718

            Bug ID: 119718
           Summary: __attribute__((musttail)) affects whether -foptimize-tail-calls will in fact optimize a tail call
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lucier at math dot purdue.edu
  Target Milestone: ---

I built the Gambit Scheme system with this compiler:

    /pkgs/gcc-mainline/bin/gcc -v
    Using built-in specs.
    COLLECT_GCC=/pkgs/gcc-mainline/bin/gcc
    COLLECT_LTO_WRAPPER=/pkgs/gcc-mainline/libexec/gcc/x86_64-pc-linux-gnu/15.0.1/lto-wrapper
    Target: x86_64-pc-linux-gnu
    Configured with: ../../gcc-mainline/configure --prefix=/pkgs/gcc-mainline/ --enable-languages=c --enable-checking=release --disable-multilib
    Thread model: posix
    Supported LTO compression algorithms: zlib zstd
    gcc version 15.0.1 20250409 (experimental) (GCC)

There are lots of files; this is a typical compile call:

    /pkgs/gcc-mainline/bin/gcc -save-temps -Wno-unused -Wno-write-strings -Wdisabled-optimization -g -ftrapv -fno-strict-aliasing -fno-trapping-math -fno-math-errno -fschedule-insns2 -foptimize-sibling-calls -fipa-ra -fmove-loop-invariants -fPIC -fno-common -mpc64 -I"../include" -c -o _parms.o -I. -DHAVE_CONFIG_H _parms.c -D___LIBRARY

So I have the option -foptimize-sibling-calls.

Many of these files end with a tail call. I'll include _parms.i because it's one of the smaller ones.

I compiled two versions of _parms.i, with the only difference being the addition of __attribute__((musttail)) on one call:

    heine:~/programs/gambit/gambit/with-musttail> diff _parms.i ../without-musttail/
    14082c14082
    < ___ps->pc = ___pc; ___ps->hp=___hp; ___ps->fp=___fp; ___ps->r[0]=___r0; ___ps->r[1]=___r1; ___ps->r[2]=___r2; ___ps->r[3]=___r3; ___ps->r[4]=___r4; } __attribute__((musttail)) return (*((___host*)((((long*)((___pc)-(1))) + (1 +((-2)))))))(___ps); }
    ---
    > ___ps->pc = ___pc; ___ps->hp=___hp; ___ps->fp=___fp; ___ps->r[0]=___r0;
    > ___ps->r[1]=___r1; ___ps->r[2]=___r2; ___ps->r[3]=___r3; ___ps->r[4]=___r4; }
    > return (*((___host*)((((long*)((___pc)-(1))) + (1 +((-2)))))))(___ps); }

Without __attribute__((musttail)), the tail call is not optimized. The same thing happens in the other files, so eventually the call depth gets too big for the stack and there's a segfault. I can see that in the backtrace from gdb.

When I try to diagnose it by adding __attribute__((musttail)), the tail call is optimized, the call stack does not blow up, and there is no segfault. If I hadn't looked at the backtrace, I wouldn't have been able to tell what was going on.

I don't think __attribute__((musttail)) should change which tail calls get optimized.

File to be added next.
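
For readers without the Gambit sources, the shape of the call in question can be sketched as a small stand-alone C file. This is only an illustration under stated assumptions, not the Gambit code and not a reproducer for this bug: `struct state`, `host_fn`, `step`, `finish` and `run_steps` are hypothetical names standing in for `___ps`, `___host` and the generated host functions, and the `MUSTTAIL` macro assumes a compiler that reports the attribute through `__has_attribute` (otherwise it expands to nothing).

```c
/* Use musttail only where the compiler advertises it
   (assumption: GCC 15 or later, or a recent Clang). */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL /* attribute unavailable: plain return */
#endif

struct state;                          /* hypothetical stand-in for ___ps */
typedef long host_fn(struct state *);  /* hypothetical stand-in for ___host */

struct state {
    host_fn *pc;     /* next function to run */
    long remaining;  /* steps still to take */
    long steps;      /* steps taken so far */
};

/* Last function in the chain: returns normally. */
static long finish(struct state *s)
{
    return s->steps;
}

/* Every step ends by calling the next function through a pointer,
   which is the pattern at the end of the line shown in the diff. */
static long step(struct state *s)
{
    s->steps++;
    if (--s->remaining <= 0)
        s->pc = finish;
    MUSTTAIL return (*s->pc)(s);
}

/* Run n chained steps and return how many were taken. */
long run_steps(long n)
{
    struct state s = { step, n, 0 };
    return (*s.pc)(&s);
}
```

If the call in `step` is turned into a jump, the stack depth stays constant no matter how large `n` is. If it is compiled as an ordinary call, each step adds a frame, and a large enough `n` would be expected to exhaust the stack, which matches the backtrace described above.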