On Tue, Jul 24, 2007 at 03:32:03PM +0530, Naveen H.S. wrote:
>
> Thanks for your valuable suggestion.
> We modified the epilogue as per your suggestions. RTV/N Rn instruction
> was generated with the operand as R0 in most of the case. The redundant
> transfer of register Rn to R0 before the epilogue is still generated.
> So RTV/N does not lead to any optimization in the code size.

With this patch

Index: sh.md
===================================================================
--- sh.md	(revision 126809)
+++ sh.md	(working copy)
@@ -9375,15 +9375,26 @@
   ""
   "sh_expand_prologue (); DONE;")
 
+(define_insn "return_rtv"
+  [(set (reg:SI R0_REG)
+	(match_operand:SI 0 "register_operand" "r"))
+   (return)]
+  "TARGET_SH2A"
+  "rtv	%0"
+  [(set_attr "type" "return")
+   (set_attr "needs_delay_slot" "yes")])
+
 (define_expand "epilogue"
   [(return)]
   ""
-  "
 {
   sh_expand_epilogue (0);
-  emit_jump_insn (gen_return ());
+  if (HAVE_return_rtv)
+    emit_jump_insn (gen_return_rtv (gen_rtx_REG (SImode, R0_REG)));
+  else
+    emit_jump_insn (gen_return ());
   DONE;
-}")
+})
 
 (define_expand "eh_return"
   [(use (match_operand 0 "register_operand" ""))]

and this testcase

#include <stdint.h>
int32_t rtvtest3 (int64_t a, int64_t b)
{
  return ((a * b) >> 32);
}

I get (-m2a -O2)

_rtvtest3:
        mov     r5,r0           ! 58    movsi_ie/2      [length = 2]
        mov     r6,r3           ! 57    movsi_ie/2      [length = 2]
        mulr    r0,r3           ! 8     mul_r           [length = 2]
        mov     r7,r0           ! 59    movsi_ie/2      [length = 2]
        mulr    r0,r4           ! 10    mul_r           [length = 2]
        dmulu.l r5,r7           ! 46    umulsidi3_i     [length = 2]
        mov.l   r14,@-r15       ! 60    movsi_ie/10     [length = 4]
        add     r4,r3           ! 11    *addsi3_compact [length = 2]
        sts     mach,r1         ! 48    movsi_ie/7      [length = 2]
        add     r1,r3           ! 13    *addsi3_compact [length = 2]
        mov     r3,r0           ! 28    movsi_ie/2      [length = 2]
        mov     r15,r14         ! 61    movsi_ie/2      [length = 2]
        mov     r14,r15         ! 68    movsi_ie/2      [length = 2]
        mov.l   @r15+,r14       ! 69    movsi_ie/6      [length = 4]
        rtv     r3              ! 70    return_rtv      [length = 4]

Insn 28 should have been deleted. You'll have to ask the dataflow people
why it isn't happening.

> We masked this transfer when return type of the function is INTEGER_TYPE
> in the function expand_value_return (rtx val) in gcc/stmt.c. This
> resulted in some regression FAIL. RTV/N Rn is generated only when
> return type of the function is INTEGER_TYPE. How to avoid redundant move
> without any regression failures?

What "regression failures"?

> We tried to get the register Rn from the function expand_value_return
> (rtx val) in gcc/stmt.c. The register Rn can be used as the operand in
> "return_rtv". The Rn register obtained from the above function is a
> PSEUDO register. Kindly suggest a way to get HARD register instead of a
> PSEUDO register?

true_regnum(), but I doubt that is the way to go.

-- 
Rask Ingemann Lambertsen