Delving again into the SPARC back-end to reimplement the -mflat option leads to the discovery of a few thinkos that date back to the conversion to RTL pro- and epilogue:
1. eligible_for_return_delay tests whether registers are saved in the frame. This is unnecessary since the save insns are now emitted in the RTL stream, 2. save_or_restore_regs saves all registers at the same location(!), 3. sparc_can_use_return_insn_p doesn't test whether registers are saved. Of course they have been silent since they were introduced as no registers are ever saved on the SPARC thanks to the register windows, unless you start to fiddle with the -fcall-saved-REG option; apparently nobody does that. 4. We compute a snapshot of current_function_uses_only_leaf_regs during the prologue in order to be able to emit bare returns when possible; the final value is computed much later in the pass pipeline. There is a big comment explaining why this is legitimate. We use the standard formula: sparc_leaf_function_p = optimize > 0 && leaf_function_p () && only_leaf_regs_used (); The thing is, leaf_function_p () always returns 1 in this situation. The hitch is that thread_prologue_and_epilogue_insns starts a RTL sequence for the pro- and epilogue and leaf_function_p () uses get_insns () to scan the insn stream; now get_insns () returns NULL_RTX at the beginning of a sequence... This is apparently a known pitfall, see this comment in config/arm/arm.c: /* We need to know if we are a leaf function. Unfortunately, it is possible to be called after start_sequence has been called, which causes get_insns to return the insns for the sequence, not the function, which will cause leaf_function_p to return the incorrect result. but several other back-ends are affected as well, including the ARM back-end in thumb_force_lr_save. The fix is to use current_function_is_leaf (after register allocation though). Again this has been silent for SPARC because only_leaf_regs_used () "dominates" in the formula, i.e. gives the right answer alone. Bootstrapped/regtested on SPARC/Solaris and SPARC64/Solaris, applied on the mainline and 4.6/4.5/4.4 branches. 2011-05-21 Eric Botcazou <ebotca...@adacore.com> * config/sparc/sparc.c (eligible_for_return_delay): Do not return false if there are call-saved registers here... (sparc_can_use_return_insn_p): ...but here instead. (save_or_restore_regs): Fix thinko. (sparc_expand_prologue): Use current_function_is_leaf. (sparc_frame_pointer_required): Likewise. -- Eric Botcazou
Index: config/sparc/sparc.c =================================================================== --- config/sparc/sparc.c (revision 173918) +++ config/sparc/sparc.c (working copy) @@ -2808,11 +2808,6 @@ eligible_for_return_delay (rtx trial) if (get_attr_length (trial) != 1) return 0; - /* If there are any call-saved registers, we should scan TRIAL if it - does not reference them. For now just make it easy. */ - if (num_gfregs) - return 0; - /* If the function uses __builtin_eh_return, the eh_return machinery occupies the delay slot. */ if (crtl->calls_eh_return) @@ -4400,7 +4395,7 @@ save_or_restore_regs (int low, int high, emit_move_insn (gen_rtx_REG (mode, regno), mem); /* Always preserve double-word alignment. */ - offset = (offset + 7) & -8; + offset = (offset + 8) & -8; } } @@ -4507,7 +4502,7 @@ sparc_expand_prologue (void) example, the regrename pass has special provisions to not rename to non-leaf registers in a leaf function. */ sparc_leaf_function_p - = optimize > 0 && leaf_function_p () && only_leaf_regs_used (); + = optimize > 0 && current_function_is_leaf && only_leaf_regs_used (); /* Need to use actual_fsize, since we are also allocating space for our callee (and our own register save area). */ @@ -4637,6 +4632,7 @@ bool sparc_can_use_return_insn_p (void) { return sparc_prologue_data_valid_p + && num_gfregs == 0 && (actual_fsize == 0 || !sparc_leaf_function_p); } @@ -9796,7 +9792,7 @@ sparc_expand_compare_and_swap_12 (rtx re bool sparc_frame_pointer_required (void) { - return !(leaf_function_p () && only_leaf_regs_used ()); + return !(current_function_is_leaf && only_leaf_regs_used ()); } /* The way this is structured, we can't eliminate SFP in favor of SP