Re: question about DSE

Alex Turjan Wed, 09 Sep 2009 09:59:55 -0700

Hi Michael,

> My assumption would be these two split loads of HImode are
> generated by your backend from a given SImode MEM.


Indeed, your assumption is right. Below is a mulsi3 expand in which I 
generate insns of mode HI. operands[1] gets spilled: in the producer BB as a 
single SI store, while in the consumer BB as two separate HI loads (see a_hi and 
a_lo).

(define_expand "mulsi3"
 [(match_operand:SI 0 "general_register_operand" "")
  (match_operand:SI 1 "general_register_operand" "")
  (match_operand:SI 2 "general_register_operand" "")]
 ""
 "{
  rtx buff = gen_reg_rtx(SImode);
  /* HImode halves of the SImode operands (byte offsets 0 and 2).  */
  rtx a_lo = gen_rtx_SUBREG(HImode, operands[1], 0);
  rtx a_hi = gen_rtx_SUBREG(HImode, operands[1], 2);
  rtx b_lo = gen_rtx_SUBREG(HImode, operands[2], 0);
  rtx b_hi = gen_rtx_SUBREG(HImode, operands[2], 2);
  rtx r_hi = gen_rtx_SUBREG(HImode, buff, 2);
  /* buff = a_lo * b_lo, then add the cross products into its high half.  */
  emit_insn(gen_umulhisi3(buff, a_lo, b_lo));
  emit_insn(gen_machi3(r_hi, a_hi, b_lo, r_hi));
  emit_insn(gen_machi3(r_hi, a_lo, b_hi, r_hi));
  emit_move_insn(operands[0], buff);
  DONE;
}")
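
For reference, the arithmetic this expand is meant to implement can be 
sketched in plain C. This is only an illustration, not the backend code: it 
assumes that umulhisi3 is an unsigned 16x16->32 widening multiply, that machi3 
is an HImode multiply-accumulate (dest = a * b + acc, truncated to 16 bits), 
and that SUBREG byte offset 0 selects the low half. The function name 
mulsi3_sketch is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit multiply built from 16-bit pieces,
   mirroring the umulhisi3 + 2x machi3 sequence (semantics assumed).  */
uint32_t mulsi3_sketch(uint32_t a, uint32_t b)
{
    uint16_t a_lo = (uint16_t)a, a_hi = (uint16_t)(a >> 16);
    uint16_t b_lo = (uint16_t)b, b_hi = (uint16_t)(b >> 16);

    /* umulhisi3 (assumed): unsigned 16x16 -> 32 widening multiply.  */
    uint32_t buff = (uint32_t)a_lo * b_lo;

    /* machi3 (assumed): accumulate the cross products into the high
       half only; bits above 16 of each product are discarded.  */
    uint16_t r_hi = (uint16_t)(buff >> 16);
    r_hi = (uint16_t)(r_hi + (uint32_t)a_hi * b_lo);
    r_hi = (uint16_t)(r_hi + (uint32_t)a_lo * b_hi);

    /* Recombine: the low half comes from the first multiply unchanged.  */
    return ((uint32_t)r_hi << 16) | (buff & 0xFFFFu);
}
```

The a_hi * b_hi term is dropped because it only contributes to bits 32 and 
above, which do not exist in an SImode result.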

> If so, you need
> to make sure to copy the MEM_ALIAS_SET, at least for spill slots (better
> for everything) into the newly generated HImode mems.  For spill slots
> it's not enough to set it to zero.

I get your point, but since the SI->HI split is generated in the expand, it 
does not help to copy the MEM_ALIAS_SET, because at that point the operands 
are pseudo regs, not MEMs.

However, to get a correct implementation I did the following. Instead of doing 
the split in the expand (as shown above), I made use of the following 
define_expand and define_insn_and_split:

(define_expand "mulsi3"
 [(parallel
   [(set (match_operand:SI 0 "register_operand" "")
         (mult:SI (match_operand:SI 1 "register_operand" "")
                  (match_operand:SI 2 "nonmemory_operand" "")))
    (clobber (match_operand:SI 3 "register_operand" ""))])]
 ""
 "{
  operands[3] = gen_reg_rtx(SImode);
}")


(define_insn_and_split "*mulsi3"
 [(parallel
   [(set (match_operand:SI 0 "register_operand" "=d,d")
         (mult:SI (match_operand:SI 1 "register_operand" "d,d")
                  (match_operand:SI 2 "nonmemory_operand" "d,I")))
    (clobber (match_operand:SI 3 "register_operand" "=d,d"))])]
 ""
 "#"
 "reload_completed"
 [(clobber (const_int 0))]
 "{
  rtx a_lo = gen_rtx_SUBREG(HImode, operands[1], 0);
  rtx a_hi = gen_rtx_SUBREG(HImode, operands[1], 2);
  rtx b_lo = gen_rtx_SUBREG(HImode, operands[2], 0);
  rtx b_hi = gen_rtx_SUBREG(HImode, operands[2], 2);
  rtx r_hi = gen_rtx_SUBREG(HImode, operands[3], 2);
  emit_insn(gen_umulhisi3(operands[3], a_lo, b_lo));
  emit_insn(gen_machi3(r_hi, a_hi, b_lo, r_hi));
  emit_insn(gen_machi3(r_hi, a_lo, b_hi, r_hi));
  emit_move_insn(operands[0], operands[3]);
  DONE;
}")

By using this define_insn_and_split with the split condition "reload_completed", 
I ensure that register allocation takes place on the operands of the 
"mulsi3" instruction as defined by the define_expand construct. This way, 
instead of the two separate HI loads (from my previous mail), I get only one SI 
load, which aliases with the SI store. As a consequence, the SI store is no 
longer removed.

1. What do you think about this implementation using define_insn_and_split?
2. Is it true that in define_expand constructs I should avoid introducing 
subregs?

thanks,
Alex
