https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117455
--- Comment #11 from Iain Sandoe <iains at gcc dot gnu.org> ---
(In reply to rguent...@suse.de from comment #10)
> On Mon, 11 Nov 2024, iains at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117455
> >
> > --- Comment #9 from Iain Sandoe <iains at gcc dot gnu.org> ---
> > (In reply to rguent...@suse.de from comment #8)
> > > On Sat, 9 Nov 2024, iains at gcc dot gnu.org wrote:

> I was thinking of inlining the callers of nested functions from two TUs
> into the same function.  Depending on whether the choice of stack vs.
> heap trampoline is already reflected in the IL during nested function
> lowering or only at RTL expansion time, this might or might not be
> an "interesting" setup (we can of course choose not to inline functions
> with nested function call ABI differences into the same function).

I think we're (probably) OK even for this - the nested lowering contains the
trampoline model, and nested lowering is very early on (007) - long before
streaming.  The callee ABI is unchanged by the mechanism for the trampoline,
so as long as the STATIC_CHAIN reg is set up properly the callee should not
be able to tell how it was called.
The nested lowering looks like:

stack:
```
D.2837 = __builtin_dwarf_cfa (0);
FRAME.2.FRAME_BASE.PARENT = D.2837;
FRAME.2.i = i;
__builtin_init_trampoline (&FRAME.2.bar, bar, &FRAME.2);
D.2829 = 10;
FRAME.2.counter = D.2829;
D.2834 = __builtin_adjust_trampoline (&FRAME.2.bar);
D.2835 = (void (*<T3e7>) (void)) D.2834;
```

heap:
```
try
  {
    D.2836 = __builtin_dwarf_cfa (0);
    FRAME.2.FRAME_BASE.PARENT = D.2836;
    FRAME.2.i = i;
    __builtin___gcc_nested_func_ptr_created (&FRAME.2, bar, &FRAME.2.bar);
    D.2829 = 10;
    FRAME.2.counter = D.2829;
    D.2833 = FRAME.2.bar;
    D.2834 = (void (*<T3e7>) (void)) D.2833;
    D.2830 = D.2834;
    FRAME.2.f = D.2830;
    D.2831 = FRAME.2.f;
    D.2831 ();
  }
finally
  {
    __builtin___gcc_nested_func_ptr_deleted ();
  }
```

> That is, I was wondering whether having a per-function
> -ftrampoline-impl= flag would work without any other changes

I don't see why not - but it's more likely that a user or organisation would
set policy wider than that (i.e. I'm not sure what perceived security issue
would be solved at a single function boundary) - perhaps I have just not
thought about it for long enough...

> - currently
> it's a TU wide flag but we do not enforce it to be the same during
> LTO options processing (lto-wrapper.cc:merge_and_complain), so the
> "first" TU wins in setting the LTRANS stages flag.

Tentatively (absent a way to test this), so long as we continue to link
libgcc, and libgcc provides the heap support, I'd say that's probably OK - we
do not revisit nested lowering later AFAIK.

I suppose that if a platform decided to implement the builtins in a CRT
instead of libgcc, then having the right flag on the link line might matter?

Iain