On Tue, Aug 14, 2012 at 3:35 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
> Uros,
>
> Let me try to explain to you why I used such code duplication:
>
> Here we have a common case of LEA with 3 different registers - r0
> (target), r1 (base), r2 (index) and a possible offset.
> To get better scheduling we first try to determine which register
> is preferable for the initial setting - r1 or r2 - through
> find_nearest_reg_def. And then we generate the following sequence of
> instructions:
>   r0 = r_best;
>   r0 = $const, r0
>   r0 = r_worse, r0
> that can save 2 cycles for Atom since the first 2 instructions can be hoisted up.
> I could not find a better way of coding it.
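As a rough illustration of the sequence quoted above, here is a toy model in plain C rather than GCC internals. The names split_lea, check_split_lea and base_is_best are hypothetical. The flag only stands in for whatever choice find_nearest_reg_def makes and does not model its actual return value. The sketch shows only that the mov/add/add split yields the same value as the single LEA whichever register is picked first. Unsigned arithmetic is used so that wraparound is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (not GCC code): compute base + index + disp the way the
   split sequence does, starting from whichever register is "best".
   base_is_best is a hypothetical stand-in for the scheduling choice.  */
static uint64_t
split_lea (uint64_t base, uint64_t index, uint64_t disp, int base_is_best)
{
  uint64_t best = base_is_best ? base : index;
  uint64_t worse = base_is_best ? index : base;
  uint64_t r0;

  r0 = best;    /* r0 = r_best: copy the preferred register first.  */
  r0 += disp;   /* Add the constant; needs only r0, so it can follow at once.  */
  r0 += worse;  /* Add the other register last.  */
  return r0;
}

/* Check that both orderings agree with the direct sum an LEA would form.  */
static void
check_split_lea (uint64_t base, uint64_t index, uint64_t disp)
{
  uint64_t lea = base + index + disp;

  assert (split_lea (base, index, disp, 1) == lea);
  assert (split_lea (base, index, disp, 0) == lea);
}
```

Calling check_split_lea (100, 8, 4) passes for both orderings, so the choice of the first register affects only scheduling, not the result.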

If it is important to add the constant before adding the register, then you can emit a similar sequence for the other cases, too. Something like the following:

--cut here--
...
{
  if (regno0 == regno1)
    tmp = parts.index;
  else if (regno0 == regno2)
    tmp = parts.base;
  else
    {
      rtx tmp1;

      /* regno1: base, regno2: index */
      if (find_nearest_reg_def (insn, regno1, regno2))
        tmp1 = parts.index, tmp = parts.base;
      else
        tmp1 = parts.base, tmp = parts.index;

      emit_insn (gen_rtx_SET (VOIDmode, target, tmp1));
    }

  if (parts.disp && parts.disp != const0_rtx)
    ix86_emit_binop (PLUS, mode, target, parts.disp);

  ix86_emit_binop (PLUS, mode, target, tmp);
  return;
}
--cut here--

>>> I prepared new patch and ChangeLog. Testing of x32 is in progress.

You didn't fix the vertical space and tab issues in the new patch.

Uros.