Jakub Jelinek <ja...@redhat.com> writes: > Hi! > > My recent patch for tail call improvement apparently affects also the > _Unwind_Resume_or_Rethrow function in libgcc: > > _Unwind_Reason_Code __attribute__((optimize ("no-omit-frame-pointer"))) > _Unwind_Resume_or_Rethrow (struct _Unwind_Exception *exc) > { > struct _Unwind_Context this_context, cur_context; > _Unwind_Reason_Code code; > unsigned long frames; > if (exc->private_1 == 0) > return _Unwind_RaiseException (exc); > do { __builtin_unwind_init (); uw_init_context_1 (&this_context, > __builtin_dwarf_cfa (), __builtin_return_address (0)); } while (0); > cur_context = this_context; > code = _Unwind_ForcedUnwind_Phase2 (exc, &cur_context, &frames); > ((void)(!(code == _URC_INSTALL_CONTEXT) ? abort (), 0 : 0)); > do { long offset = uw_install_context_1 ((&this_context), (&cur_context)); > void *handler = uw_frob_return_addr ((&this_context), (&cur_context)); > _Unwind_DebugHook ((&cur_context)->cfa, handler); ; __builtin_eh_return > (offset, handler); } while (0); > } > > Previously, the mere existence of the addressable variables this_context > and cur_context prevented tail call on the early out > return _Unwind_RaiseException (exc); > but since r271013 the tailcall analysis figures that while those two > variables are there, they aren't touched before the possible tail call > site, so they can't be really live during the call. > The problem is that this is one of the few calls that call > __builtin_eh_return which normally user code doesn't use, we use it just > for the unwinder routines and so some targets (in this case aarch64, in > another report powerpc-darwin) show that the combination of > cfun->calls_eh_return + tail calls has not been really tested. > One possibility is to do if (cfun->calls_eh_return) return false; in > the target hook *_ok_for_sibcall, the other possibility is to fix that > case properly. > > The following patch fixes the aarch64 case. > The code emitted for the code path from the start of the function > till the tail call was: > 0000000000002778 <_Unwind_Resume_or_Rethrow>: > 2778: d12143ff sub sp, sp, #0x850 > 277c: a9007bfd stp x29, x30, [sp] > 2780: 910003fd mov x29, sp > 2784: a90107e0 stp x0, x1, [sp, #16] > 2788: f9400801 ldr x1, [x0, #16] > 278c: a9020fe2 stp x2, x3, [sp, #32] > 2790: a90353f3 stp x19, x20, [sp, #48] > 2794: aa0003f3 mov x19, x0 > 2798: a9045bf5 stp x21, x22, [sp, #64] > 279c: a90563f7 stp x23, x24, [sp, #80] > 27a0: a9066bf9 stp x25, x26, [sp, #96] > 27a4: a90773fb stp x27, x28, [sp, #112] > 27a8: 6d0827e8 stp d8, d9, [sp, #128] > 27ac: 6d092fea stp d10, d11, [sp, #144] > 27b0: 6d0a37ec stp d12, d13, [sp, #160] > 27b4: 6d0b3fee stp d14, d15, [sp, #176] > 27b8: b5000201 cbnz x1, 27f8 > <_Unwind_Resume_or_Rethrow+0x80> > 27bc: a9407bfd ldp x29, x30, [sp] > 27c0: a94107e0 ldp x0, x1, [sp, #16] > 27c4: a9420fe2 ldp x2, x3, [sp, #32] > 27c8: a94353f3 ldp x19, x20, [sp, #48] > 27cc: a9445bf5 ldp x21, x22, [sp, #64] > 27d0: a94563f7 ldp x23, x24, [sp, #80] > 27d4: a9466bf9 ldp x25, x26, [sp, #96] > 27d8: a94773fb ldp x27, x28, [sp, #112] > 27dc: 6d4827e8 ldp d8, d9, [sp, #128] > 27e0: 6d492fea ldp d10, d11, [sp, #144] > 27e4: 6d4a37ec ldp d12, d13, [sp, #160] > 27e8: 6d4b3fee ldp d14, d15, [sp, #176] > 27ec: 912143ff add sp, sp, #0x850 > 27f0: 8b2463ff add sp, sp, x4 > 27f4: 14000000 b 23c8 <_Unwind_RaiseException> > 27f4: R_AARCH64_JUMP26 _Unwind_RaiseException > This does a lot of register saving and restoring, which is not needed but is > not wrong-code (guess separate shrink wrapping would help here if > implemented for the target). The only wrong-code is actually the > add sp, sp, x4 instruction though. The previous instruction restored sp to > the value it had at the start of the function and then we should just tail > call. This instruction is something that is needed in the spot where > __builtin_eh_return is emitted. > > Fixed thusly, bootstrapped/regtested on aarch64-linux, ok for trunk? > > 2019-05-10 Jakub Jelinek <ja...@redhat.com> > > PR c++/59813 > * config/aarch64/aarch64.c (aarch64_expand_epilogue): Don't add > EH_RETURN_STACKADJ_RTX to sp in sibcall epilogues.
OK, thanks. FWIW, I agree we should just fix the targets and not even use the workaround you posted later. It won't be the first time that many targets have got something wrong due to lack of coverage. Richard > > --- gcc/config/aarch64/aarch64.c.jj 2019-05-02 12:18:40.004979690 +0200 > +++ gcc/config/aarch64/aarch64.c 2019-05-09 20:08:00.774718003 +0200 > @@ -5913,7 +5913,7 @@ aarch64_expand_epilogue (bool for_sibcal > } > > /* Stack adjustment for exception handler. */ > - if (crtl->calls_eh_return) > + if (crtl->calls_eh_return && !for_sibcall) > { > /* We need to unwind the stack by the offset computed by > EH_RETURN_STACKADJ_RTX. We have already reset the CFA > > Jakub