Richard Henderson schrieb: > On 03/11/2011 05:43 AM, Georg-Johann Lay wrote: >> I did not find a way to make this work together with -mcall-prologues. >> Please let me know if you have suggestion on how call prologues can be >> combine with tail calls. > > You need a new symbol in libgcc for this. It should be easy enough to have > the sibcall epilogue load up Z+EIND before jumping to the new symbol > (perhaps called __sibcall_restores__). This new symbol would be just like > the existing __epilogue_restores__ except that it would finish with an > eijmp/ijmp instruction (depending on multilib) instead of a ret instruction.
The point is that the intent of -mprologue-saves is to save program memory space (flash). If we load the callee's address in the epilogue, the epilogue's size will increase. Moreover, adding an other lengthy function besides __epilogue_restores__ to libgcc, that will increase code size because in many cases we will see both epilogue flavors being dragged into application. Note that, in comparison with non-tailcall, a tailcall without prologue-saves saves one instruction, i.e. the ret. This is no more true in presence of prologue-saves, i.e. even with tail-calling the code would be neither faster nor smaller. (It might still save some stack space, though) >> The implementation uses struct machine_function to pass information >> around, i.e. from avr_function_arg_advance to avr_function_ok_for_sibcall. > > Look at how the s390 port handles this exact problem. > > /* Register 6 on s390 is available as an argument register but unfortunately > "caller saved". This makes functions needing this register for arguments > not suitable for sibcalls. */ > return !s390_call_saved_register_used (exp); > > I'll admit that it would be helpful if the cumulative_args pointer was passed > into the ok_for_sibcall hook, but it's not *that* hard to recreate that value > by hand. This is what the s390_call_saved_register_used function does. See also discussion in http://gcc.gnu.org/ml/gcc/2011-03/msg00048.html I must admit that I did not try to copy-paste all code in calls.c that deals with preparation of arguments and mapping trees to rtx and see how it deflates when applying all the #if and #ifdef and avr backend implications... Maybe we see Joern's work being committed soon so that on top of this a patch adding cumulative args to function_of_for_sibcall will be approved. Presumably, even s390 backend guys would prefer reusing information instead of recomputing them. > >> + || (avr_OS_task_function_p (decl_callee) ^ avr_OS_task_function_p >> (current_function_decl)) > > Please just use != instead of ^ here. Also, needs line wrapping. Done in the patch attached. In the cases of OS_task resp.OS_main, I think I am a bit over-conservative when rejecting tail-calls. This is basically because there is *no* documentation of these attributes. Presumably they implement "naked+ret" so that tail-calling from OS_task resp. OS_main should be ok? > I do like very much how you've cleaned up the call patterns. IMO this should > be committed as a separate patch; I'll let the AVR maintainers approve it > though. If anyone desires an explicit, intermediate step I will provide according patch, of course. New patch attached. Changes with respect to original patch: - use != instead of ^ to ensure caller/callee epilogue compatibility for functions attributed OS_task resp OS_main - add some more comment to md Johann 2011-03-16 Georg-Johann Lay <a...@gjlay.de> * config/avr/avr-protos.h (expand_epilogue): Change prototype * config/avr/avr.h (struct machine_function): Add field sibcall_fails. * config/avr/avr.c (init_cumulative_args, avr_function_arg_advance): Use it. * config/avr/avr.c (expand_epilogue): Add bool parameter. Handle sibcall epilogues. (TARGET_FUNCTION_OK_FOR_SIBCALL): Define to... (avr_function_ok_for_sibcall): ...this new function. (avr_lookup_function_attribute1): New static Function. (avr_naked_function_p, interrupt_function_p, signal_function_p, avr_OS_task_function_p, avr_OS_main_function_p): Use it. * config/avr/avr.md ("sibcall", "sibcall_value", "sibcall_epilogue"): New expander. ("*call_insn", "*call_value_insn"): New insn. ("call_insn", "call_value_insn"): Remove ("call", "call_value", "epilogue"): Change expander to handle sibling calls.
Index: config/avr/avr-protos.h =================================================================== --- config/avr/avr-protos.h (Revision 171039) +++ config/avr/avr-protos.h (Arbeitskopie) @@ -76,7 +76,7 @@ extern const char *lshrsi3_out (rtx insn extern bool avr_rotate_bytes (rtx operands[]); extern void expand_prologue (void); -extern void expand_epilogue (void); +extern void expand_epilogue (bool); extern int avr_epilogue_uses (int regno); extern void avr_output_bld (rtx operands[], int bit_nr); Index: config/avr/avr.md =================================================================== --- config/avr/avr.md (Revision 171039) +++ config/avr/avr.md (Arbeitskopie) @@ -2647,94 +2647,91 @@ ;; call (define_expand "call" - [(call (match_operand:HI 0 "call_insn_operand" "") - (match_operand:HI 1 "general_operand" ""))] + [(parallel[(call (match_operand:HI 0 "call_insn_operand" "") + (match_operand:HI 1 "general_operand" "")) + (use (const_int 0))])] ;; Operand 1 not used on the AVR. + ;; Operand 2 is 1 for tail-call, 0 otherwise. + "" + "") + +(define_expand "sibcall" + [(parallel[(call (match_operand:HI 0 "call_insn_operand" "") + (match_operand:HI 1 "general_operand" "")) + (use (const_int 1))])] + ;; Operand 1 not used on the AVR. + ;; Operand 2 is 1 for tail-call, 0 otherwise. "" "") ;; call value (define_expand "call_value" - [(set (match_operand 0 "register_operand" "") - (call (match_operand:HI 1 "call_insn_operand" "") - (match_operand:HI 2 "general_operand" "")))] + [(parallel[(set (match_operand 0 "register_operand" "") + (call (match_operand:HI 1 "call_insn_operand" "") + (match_operand:HI 2 "general_operand" ""))) + (use (const_int 0))])] ;; Operand 2 not used on the AVR. + ;; Operand 3 is 1 for tail-call, 0 otherwise. "" "") -(define_insn "call_insn" - [(call (mem:HI (match_operand:HI 0 "nonmemory_operand" "!z,*r,s,n")) - (match_operand:HI 1 "general_operand" "X,X,X,X"))] -;; We don't need in saving Z register because r30,r31 is a call used registers +(define_expand "sibcall_value" + [(parallel[(set (match_operand 0 "register_operand" "") + (call (match_operand:HI 1 "call_insn_operand" "") + (match_operand:HI 2 "general_operand" ""))) + (use (const_int 1))])] + ;; Operand 2 not used on the AVR. + ;; Operand 3 is 1 for tail-call, 0 otherwise. + "" + "") + +(define_insn "*call_insn" + [(parallel[(call (mem:HI (match_operand:HI 0 "nonmemory_operand" "z,s,z,s")) + (match_operand:HI 1 "general_operand" "X,X,X,X")) + (use (match_operand:HI 2 "const_int_operand" "L,L,P,P"))])] ;; Operand 1 not used on the AVR. - "(register_operand (operands[0], HImode) || CONSTANT_P (operands[0]))" - "*{ - if (which_alternative==0) - return \"%!icall\"; - else if (which_alternative==1) - { - if (AVR_HAVE_MOVW) - return (AS2 (movw, r30, %0) CR_TAB - \"%!icall\"); - else - return (AS2 (mov, r30, %A0) CR_TAB - AS2 (mov, r31, %B0) CR_TAB - \"%!icall\"); - } - else if (which_alternative==2) - return AS1(%~call,%x0); - return (AS2 (ldi,r30,lo8(%0)) CR_TAB - AS2 (ldi,r31,hi8(%0)) CR_TAB - \"%!icall\"); -}" - [(set_attr "cc" "clobber,clobber,clobber,clobber") + ;; Operand 2 is 1 for tail-call, 0 otherwise. + "" + "@ + %!icall + %~call %x0 + %!ijmp + %~jmp %x0" + [(set_attr "cc" "clobber") (set_attr_alternative "length" - [(const_int 1) - (if_then_else (eq_attr "mcu_have_movw" "yes") - (const_int 2) - (const_int 3)) - (if_then_else (eq_attr "mcu_mega" "yes") - (const_int 2) - (const_int 1)) - (const_int 3)])]) - -(define_insn "call_value_insn" - [(set (match_operand 0 "register_operand" "=r,r,r,r") - (call (mem:HI (match_operand:HI 1 "nonmemory_operand" "!z,*r,s,n")) -;; We don't need in saving Z register because r30,r31 is a call used registers - (match_operand:HI 2 "general_operand" "X,X,X,X")))] + [(const_int 1) + (if_then_else (eq_attr "mcu_mega" "yes") + (const_int 2) + (const_int 1)) + (const_int 1) + (if_then_else (eq_attr "mcu_mega" "yes") + (const_int 2) + (const_int 1))])]) + +(define_insn "*call_value_insn" + [(parallel[(set (match_operand 0 "register_operand" "=r,r,r,r") + (call (mem:HI (match_operand:HI 1 "nonmemory_operand" "z,s,z,s")) + (match_operand:HI 2 "general_operand" "X,X,X,X"))) + (use (match_operand:HI 3 "const_int_operand" "L,L,P,P"))])] ;; Operand 2 not used on the AVR. - "(register_operand (operands[0], VOIDmode) || CONSTANT_P (operands[0]))" - "*{ - if (which_alternative==0) - return \"%!icall\"; - else if (which_alternative==1) - { - if (AVR_HAVE_MOVW) - return (AS2 (movw, r30, %1) CR_TAB - \"%!icall\"); - else - return (AS2 (mov, r30, %A1) CR_TAB - AS2 (mov, r31, %B1) CR_TAB - \"%!icall\"); - } - else if (which_alternative==2) - return AS1(%~call,%x1); - return (AS2 (ldi, r30, lo8(%1)) CR_TAB - AS2 (ldi, r31, hi8(%1)) CR_TAB - \"%!icall\"); -}" - [(set_attr "cc" "clobber,clobber,clobber,clobber") + ;; Operand 3 is 1 for tail-call, 0 otherwise. + "" + "@ + %!icall + %~call %x1 + %!ijmp + %~jmp %x1" + [(set_attr "cc" "clobber") (set_attr_alternative "length" - [(const_int 1) - (if_then_else (eq_attr "mcu_have_movw" "yes") - (const_int 2) - (const_int 3)) - (if_then_else (eq_attr "mcu_mega" "yes") - (const_int 2) - (const_int 1)) - (const_int 3)])]) + [(const_int 1) + (if_then_else (eq_attr "mcu_mega" "yes") + (const_int 2) + (const_int 1)) + (const_int 1) + (if_then_else (eq_attr "mcu_mega" "yes") + (const_int 2) + (const_int 1))])]) (define_insn "nop" [(const_int 0)] @@ -3246,8 +3243,15 @@ (define_expand "epilogue" [(const_int 0)] "" - " { - expand_epilogue (); + expand_epilogue (false /* sibcall_p */); DONE; - }") + }) + +(define_expand "sibcall_epilogue" + [(const_int 0)] + "" + { + expand_epilogue (true /* sibcall_p */); + DONE; + }) Index: config/avr/avr.c =================================================================== --- config/avr/avr.c (Revision 171039) +++ config/avr/avr.c (Arbeitskopie) @@ -100,6 +100,7 @@ static rtx avr_function_arg (CUMULATIVE_ static void avr_function_arg_advance (CUMULATIVE_ARGS *, enum machine_mode, const_tree, bool); static void avr_help (void); +static bool avr_function_ok_for_sibcall (tree, tree); /* Allocate registers from r25 to r8 for parameters for function calls. */ #define FIRST_CUM_REG 26 @@ -231,6 +232,9 @@ static const struct default_options avr_ #undef TARGET_HELP #define TARGET_HELP avr_help +#undef TARGET_FUNCTION_OK_FOR_SIBCALL +#define TARGET_FUNCTION_OK_FOR_SIBCALL avr_function_ok_for_sibcall + struct gcc_target targetm = TARGET_INITIALIZER; static void @@ -330,17 +334,34 @@ avr_regno_reg_class (int r) return ALL_REGS; } +/* A helper for the subsequent function attribute used to dig for + attribute 'name' in a FUNCTION_DECL or FUNCTION_TYPE */ + +static inline int +avr_lookup_function_attribute1 (const_tree func, const char *name) +{ + if (FUNCTION_DECL == TREE_CODE (func)) + { + if (NULL_TREE != lookup_attribute (name, DECL_ATTRIBUTES (func))) + { + return true; + } + + func = TREE_TYPE (func); + } + + gcc_assert (TREE_CODE (func) == FUNCTION_TYPE + || TREE_CODE (func) == METHOD_TYPE); + + return NULL_TREE != lookup_attribute (name, TYPE_ATTRIBUTES (func)); +} + /* Return nonzero if FUNC is a naked function. */ static int avr_naked_function_p (tree func) { - tree a; - - gcc_assert (TREE_CODE (func) == FUNCTION_DECL); - - a = lookup_attribute ("naked", TYPE_ATTRIBUTES (TREE_TYPE (func))); - return a != NULL_TREE; + return avr_lookup_function_attribute1 (func, "naked"); } /* Return nonzero if FUNC is an interrupt function as specified @@ -349,13 +370,7 @@ avr_naked_function_p (tree func) static int interrupt_function_p (tree func) { - tree a; - - if (TREE_CODE (func) != FUNCTION_DECL) - return 0; - - a = lookup_attribute ("interrupt", DECL_ATTRIBUTES (func)); - return a != NULL_TREE; + return avr_lookup_function_attribute1 (func, "interrupt"); } /* Return nonzero if FUNC is a signal function as specified @@ -364,13 +379,7 @@ interrupt_function_p (tree func) static int signal_function_p (tree func) { - tree a; - - if (TREE_CODE (func) != FUNCTION_DECL) - return 0; - - a = lookup_attribute ("signal", DECL_ATTRIBUTES (func)); - return a != NULL_TREE; + return avr_lookup_function_attribute1 (func, "signal"); } /* Return nonzero if FUNC is a OS_task function. */ @@ -378,12 +387,7 @@ signal_function_p (tree func) static int avr_OS_task_function_p (tree func) { - tree a; - - gcc_assert (TREE_CODE (func) == FUNCTION_DECL); - - a = lookup_attribute ("OS_task", TYPE_ATTRIBUTES (TREE_TYPE (func))); - return a != NULL_TREE; + return avr_lookup_function_attribute1 (func, "OS_task"); } /* Return nonzero if FUNC is a OS_main function. */ @@ -391,12 +395,7 @@ avr_OS_task_function_p (tree func) static int avr_OS_main_function_p (tree func) { - tree a; - - gcc_assert (TREE_CODE (func) == FUNCTION_DECL); - - a = lookup_attribute ("OS_main", TYPE_ATTRIBUTES (TREE_TYPE (func))); - return a != NULL_TREE; + return avr_lookup_function_attribute1 (func, "OS_main"); } /* Return the number of hard registers to push/pop in the prologue/epilogue @@ -864,7 +863,7 @@ avr_epilogue_uses (int regno ATTRIBUTE_U /* Output RTL epilogue. */ void -expand_epilogue (void) +expand_epilogue (bool sibcall_p) { int reg; int live_seq; @@ -875,6 +874,8 @@ expand_epilogue (void) /* epilogue: naked */ if (cfun->machine->is_naked) { + gcc_assert (!sibcall_p); + emit_jump_insn (gen_return ()); return; } @@ -1016,7 +1017,8 @@ expand_epilogue (void) emit_insn (gen_popqi (zero_reg_rtx)); } - emit_jump_insn (gen_return ()); + if (!sibcall_p) + emit_jump_insn (gen_return ()); } } @@ -1629,6 +1631,10 @@ init_cumulative_args (CUMULATIVE_ARGS *c cum->regno = FIRST_CUM_REG; if (!libname && stdarg_p (fntype)) cum->nregs = 0; + + /* Assume the calle may be tail called */ + + cfun->machine->sibcall_fails = 0; } /* Returns the number of registers to allocate for a function argument. */ @@ -1676,6 +1682,23 @@ avr_function_arg_advance (CUMULATIVE_ARG cum->nregs -= bytes; cum->regno -= bytes; + /* A parameter is being passed in a call-saved register. As the original + contents of these regs has to be restored before leaving the function, + a function must not pass arguments in call-saved regs in order to get + tail-called. */ + + if (cum->regno >= 0 + && !call_used_regs[cum->regno]) + { + /* FIXME: We ship info on failing tail-call in struct machine_function. + This uses internals of calls.c:expand_call() and the way args_so_far + is used. targetm.function_ok_for_sibcall() needs to be extended to + pass &args_so_far, too. At present, CUMULATIVE_ARGS is target + dependent so that such an extension is not wanted. */ + + cfun->machine->sibcall_fails = 1; + } + if (cum->nregs <= 0) { cum->nregs = 0; @@ -1683,6 +1706,62 @@ avr_function_arg_advance (CUMULATIVE_ARG } } +/* Implement `TARGET_FUNCTION_OK_FOR_SIBCALL' */ +/* Decide whether we can make a sibling call to a function. DECL is the + declaration of the function being targeted by the call and EXP is the + CALL_EXPR representing the call. */ + +static bool +avr_function_ok_for_sibcall (tree decl_callee, tree exp_callee) +{ + tree fntype_callee; + + /* Tail-calling must fail if callee-saved regs are used to pass + function args. We must not tail-call when `epilogue_restores' + is used. Unfortunately, we cannot tell at this point if that + actually will happen or not, and we cannot step back from + tail-calling. Thus, we inhibit tail-calling with -mcall-prologues. */ + + if (cfun->machine->sibcall_fails + || TARGET_CALL_PROLOGUES) + { + return false; + } + + fntype_callee = TREE_TYPE (CALL_EXPR_FN (exp_callee)); + + if (decl_callee) + { + decl_callee = TREE_TYPE (decl_callee); + } + else + { + decl_callee = fntype_callee; + + while (FUNCTION_TYPE != TREE_CODE (decl_callee) + && METHOD_TYPE != TREE_CODE (decl_callee)) + { + decl_callee = TREE_TYPE (decl_callee); + } + } + + /* Ensure that caller and callee have compatible epilogues */ + + if (interrupt_function_p (current_function_decl) + || signal_function_p (current_function_decl) + || avr_naked_function_p (decl_callee) + || avr_naked_function_p (current_function_decl) + || (avr_OS_task_function_p (decl_callee) + != avr_OS_task_function_p (current_function_decl)) + || (avr_OS_main_function_p (decl_callee) + != avr_OS_main_function_p (current_function_decl))) + { + return false; + } + + return true; +} + /*********************************************************************** Functions for outputting various mov's for a various modes ************************************************************************/ Index: config/avr/avr.opt =================================================================== --- config/avr/avr.opt (Revision 171039) +++ config/avr/avr.opt (Arbeitskopie) @@ -29,6 +29,9 @@ Target RejectNegative Joined Var(avr_mcu mdeb Target Report Undocumented Mask(ALL_DEBUG) +mtest= +Target RejectNegative Report Joined Undocumented UInteger Var(avr_test) Init(0) + mint8 Target Report Mask(INT8) Use an 8-bit 'int' type Index: config/avr/avr.h =================================================================== --- config/avr/avr.h (Revision 171039) +++ config/avr/avr.h (Arbeitskopie) @@ -828,4 +828,7 @@ struct GTY(()) machine_function /* Current function stack size. */ int stack_usage; + + /* 'true' if a callee might be tail called */ + int sibcall_fails; };