Michael Meissner wrote: > On Mon, Jul 11, 2011 at 12:38:34PM +0200, Georg-Johann Lay wrote: >> How do I write a pre-reload combine + pre-reload split correctly? >> I'd like to avoid clobber reg. >> >> Thanks much for any hint. > > The move patterns are always kind of funny, particularly during register > allocation. > > Lets see given your pattern is: > > (define_insn_and_split "*mulsqihi3.const" > [(set (match_operand:HI 0 "register_operand" "=&r") > (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "a")) > (match_operand:HI 2 "u8_operand" "n")))] > "AVR_HAVE_MUL > && !reload_completed > && !reload_in_progress" > { gcc_unreachable(); } > "&& 1" > [(set (match_dup 3) > (match_dup 2)) > ; *mulsu > (set (match_dup 0) > (mult:HI (sign_extend:HI (match_dup 1)) > (zero_extend:HI (match_dup 3))))] > { > operands[3] = gen_reg_rtx (QImode); > }) > > I would probably rewrite it as: > > (define_insn_and_split "*mulsqihi3.const" > [(set (match_operand:HI 0 "register_operand" "=&r") > (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "a")) > (match_operand:HI 2 "u8_operand" "n")))] > "AVR_HAVE_MUL > && !reload_completed > && !reload_in_progress" > { gcc_unreachable(); } > "&& 1" > [(set (match_dup 3) > (unspec:QI [(match_dup 2)] WRAPPER)) > ; *mulsu > (set (match_dup 0) > (mult:HI (sign_extend:HI (match_dup 1)) > (zero_extend:HI (match_dup 3))))] > { > operands[3] = gen_reg_rtx (QImode); > }) > > (define_insn "*wrapper" > [(set (match_operand:QI 0 "register_operand" "=&r") > (unspec:QI [(match_operand:QI 1 "u8_operand" "n")] WRAPPER))] > "AVR_HAVE_MUL" > "...") > > That way you are using the unspec to make the move not look like a generic > move.
All the trouble arises because there is no straight forward way to write the right insn condition, doesn't it? Working around like that will work but it is obfuscating the code, IMHO. Is there a specific reason for early-clobber ind *wrapper? As *wrapper is not a proper move, could this produce move-move-sequences? These would have to be fixed in peep2 or so. > The other way to do it, would be to split it to another pattern that combines > the move and the HI multiply, which you then split after reload. Something > like: > > (define_insn_and_split "*mulsqihi3_const" > [(set (match_operand:HI 0 "register_operand" "=&r") > (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "a")) > (match_operand:HI 2 "u8_operand" "n")))] > "AVR_HAVE_MUL > && !reload_completed > && !reload_in_progress" > { gcc_unreachable(); } > "&& 1" > [(parallel [(set (match_dup 3) > (match_dup 2)) > ; *mulsu > (set (match_dup 0) > (mult:HI (sign_extend:HI (match_dup 1)) > (zero_extend:HI (match_dup 3))))])] > { > operands[3] = gen_reg_rtx (QImode); > }) > > (define_insn_and_split "*mulsqihi3_const2" > [(set (match_operand:QI 0 "register_operand" "r") > (match_operand:QI 1 "u8_operand" "n")) > (set (match_operand:HI 2 "register_operand" "r") > (mult:HI (sign_extend:HI (match_operand:QI 3 "register_operand" "a")) > (zero_extend:HI (match_dup 0))))] > "AVR_HAV_MUL" > "#" > "&& reload_completed" > [(set (match_dup 0) > (match_dup 1)) > (set (match_dup 2) > (mult:HI (sign_extend:HI (match_dup 3)) > (zero_extend:HI (match_dup 0))))] > {}) The latest patch http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00898.html works around the insn condition shortcoming by writing a gate function. This is the missing part, and if gcc learns something like !ira_in_progress or !split1_completed in the future, the cleanup will be minimal and straight forward. The code is obvious and without obfuscation: (define_insn_and_split "*mulsqihi3.sconst" [(set (match_operand:HI 0 "register_operand" "=r") (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "d")) (match_operand:HI 2 "s8_operand" "n")))] "AVR_HAVE_MUL && avr_gate_split1()" { gcc_unreachable(); } "&& 1" [(set (match_dup 3) (match_dup 2)) ; mulqihi3 (set (match_dup 0) (mult:HI (sign_extend:HI (match_dup 1)) (sign_extend:HI (match_dup 3))))] { operands[2] = GEN_INT (trunc_int_for_mode (INTVAL (operands[2]), QImode)); operands[3] = gen_reg_rtx (QImode); }) /* FIXME: We compose some insns by means of insn combine and split them in split1. We don't want IRA/reload to combine them to the original insns again because that avoid some CSE optimizations if constants are involved. If IRA/reload combines, the recombined insns get split again after reload, but then CSE does not take place. It appears that at present there is no other way to take away the insn from IRA. Notice that split1 runs unconditionally so that all our insns will get split no matter of command line options. */ #include "tree-pass.h" bool avr_gate_split1 (void) { if (current_pass->static_pass_number < pass_match_asm_constraints.pass.static_pass_number) return true; return false; } I choose .asmcons because it runs between IRA and split1, and because I observed that pass numbers are fuzzy; presumably because sub-passes like df etc. Johann