On Sat, Aug 20, 2011 at 11:31 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> We can also implement MULX with split:
>
> (define_split
>   [(parallel [(set (match_operand:<DWI> 0 "register_operand" "")
>                    (mult:<DWI>
>                      (zero_extend:<DWI>
>                        (match_operand:DWIH 1 "nonimmediate_operand" ""))
>                      (zero_extend:<DWI>
>                        (match_operand:DWIH 2 "nonimmediate_operand" ""))))
>               (clobber (reg:CC FLAGS_REG))])]
>   "TARGET_BMI2
>    && ix86_binary_operator_ok (MULT, <MODE>mode, operands)"
>   [(set (match_operand:<DWI> 0 "register_operand" "")
>         (mult:<DWI>
>           (zero_extend:<DWI>
>             (match_operand:DWIH 1 "register_operand" ""))
>           (zero_extend:<DWI>
>             (match_operand:DWIH 2 "nonimmediate_operand" ""))))])

Well, this is an unconditional splitter, no better than the current
approach, where the pattern is expanded directly.

If you want to squeeze out the last 0.005% of performance, you should
add a BMI2 alternative to the existing umul pattern, leave the choice of
alternative to the RA, and split that exact alternative to the mulx
pattern after reload (that is, you need some true_regnum calls in the
splitter condition). Please see the new patterns for how this should be
done.

I'm not against this approach, but after 10 hours of hacking, I just
wanted to leave it to an interested reader ;)

Uros.
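
For the interested reader, a rough sketch of the post-reload split described above might look like the following. It is an illustration only and not taken from the new patterns mentioned in the message. It assumes that the existing umul pattern has gained an alternative that places operand 1 in DX (the implicit source of MULX), and that the mulx insn is written as a two-set parallel with no flags clobber. The operand numbers 3, 4 and 5 and the exact shape of the replacement are hypothetical.

```
;; Sketch: convert the flag-clobbering widening multiply into the
;; flag-free mulx form, but only after reload and only when the RA
;; has actually picked the alternative with operand 1 in DX.
(define_split
  [(set (match_operand:<DWI> 0 "register_operand" "")
        (mult:<DWI>
          (zero_extend:<DWI>
            (match_operand:DWIH 1 "register_operand" ""))
          (zero_extend:<DWI>
            (match_operand:DWIH 2 "nonimmediate_operand" ""))))
   (clobber (reg:CC FLAGS_REG))]
  ;; The true_regnum check is what makes the split conditional:
  ;; other alternatives are left alone and emitted as plain mul.
  "TARGET_BMI2 && reload_completed
   && true_regnum (operands[1]) == DX_REG"
  [(parallel [(set (match_dup 3)          ; low half of the product
                   (mult:DWIH (match_dup 1) (match_dup 2)))
              (set (match_dup 4)          ; high half of the product
                   (truncate:DWIH
                     (lshiftrt:<DWI>
                       (mult:<DWI> (zero_extend:<DWI> (match_dup 1))
                                   (zero_extend:<DWI> (match_dup 2)))
                       (match_dup 5))))])]
{
  /* Split the double-word destination into low and high halves.  */
  split_double_mode (<DWI>mode, &operands[0], 1, &operands[3], &operands[4]);

  /* Shift count that selects the high half of the product.  */
  operands[5] = GEN_INT (GET_MODE_BITSIZE (<MODE>mode));
})
```

Unlike the quoted splitter, this one fires only for the register assignment that MULX can actually use, so the decision between mul and mulx stays with the register allocator.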