On Fri, Aug 13, 2021 at 2:48 AM Hongyu Wang <hongyu.w...@intel.com> wrote:
>
> Hi,
>
> For lea + zero_extendsidi insns, if dest of lea and src of zext are the
> same, combine them with single leal under 64bit target since 32bit
> register will be automatically zero-extended.
>
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> Ok for master?
>
> gcc/ChangeLog:
>
>         PR target/101716
>         * config/i386/i386.md (*lea<mode>_zext): New define_insn.
>         (define_peephole2): New peephole2 to combine zero_extend
>         with lea.
>
> gcc/testsuite/ChangeLog:
>
>         PR target/101716
>         * gcc.target/i386/pr101716.c: New test.

This form should be covered by ix86_decompose_address via the
address_no_seg_operand predicate.

Combine creates:

Trying 6 -> 7:
    6: {r86:DI=r87:DI<<0x1;clobber flags:CC;}
      REG_DEAD r87:DI
      REG_UNUSED flags:CC
    7: r85:DI=zero_extend(r86:DI#0)
      REG_DEAD r86:DI
Failed to match this instruction:
(set (reg:DI 85)
    (and:DI (ashift:DI (reg:DI 87)
            (const_int 1 [0x1]))
        (const_int 4294967294 [0xfffffffe])))

which does not fit:

      else if (GET_CODE (addr) == AND
               && const_32bit_mask (XEXP (addr, 1), DImode))

After reload, we lose SUBREG, so REE does not trigger on:

(insn 17 3 7 2 (set (reg:DI 0 ax [86])
        (mult:DI (reg:DI 5 di [87])
            (const_int 2 [0x2]))) "pr101716.c":4:13 204 {*leadi}
     (nil))
(insn 7 17 13 2 (set (reg:DI 0 ax [85])
        (zero_extend:DI (reg:SI 0 ax [86]))) "pr101716.c":4:19 136 {*zero_extendsidi2}
     (nil))

So the question is whether the combine pass really needs to zero-extend
with 0xfffffffe. The left shift << 1 already guarantees zero in the LSB,
so 0xffffffff should be better and in line with the canonical
zero-extension RTX.

> ---
>  gcc/config/i386/i386.md                  | 20 ++++++++++++++++++++
>  gcc/testsuite/gcc.target/i386/pr101716.c | 11 +++++++++++
>  2 files changed, 31 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101716.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 4a8e8fea290..6739dbd799b 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -5187,6 +5187,26 @@
>         (const_string "SI")
>         (const_string "<MODE>")))])
>
> +;; combine zero_extendsidi with lea to use leal.
> +(define_insn "*lea<mode>_zext"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +       (zero_extend:DI
> +         (match_operand:SWI48 1 "address_no_seg_operand" "Ts")))]
> +  "TARGET_64BIT"
> +  "lea{l}\t{%E1, %k0|%k0,%E1}")

The above can lead to invalid RTX when SWI48 is instantiated as DImode:
(zero_extend:DI (... DImode RTX)).

Uros.
> +
> +(define_peephole2
> +  [(set (match_operand:SWI48 0 "general_reg_operand")
> +       (match_operand:SWI48 1 "address_no_seg_operand"))
> +   (set (match_operand:DI 2 "general_reg_operand")
> +       (zero_extend:DI (match_operand:SI 3 "general_reg_operand")))]
> +  "TARGET_64BIT && ix86_hardreg_mov_ok (operands[2], operands[1])
> +   && REGNO (operands[0]) == REGNO (operands[3])
> +   && (REGNO (operands[2]) == REGNO (operands[3])
> +       || peep2_reg_dead_p (2, operands[3]))"
> +  [(set (match_dup 2)
> +       (zero_extend:DI (match_dup 1)))])
> +
>  (define_peephole2
>    [(set (match_operand:SWI48 0 "register_operand")
>         (match_operand:SWI48 1 "address_no_seg_operand"))]
> diff --git a/gcc/testsuite/gcc.target/i386/pr101716.c b/gcc/testsuite/gcc.target/i386/pr101716.c
> new file mode 100644
> index 00000000000..0b684755c2f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr101716.c
> @@ -0,0 +1,11 @@
> +/* PR target/101716 */
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2" } */
> +
> +/* { dg-final { scan-assembler "leal\[\\t \]\*eax" } } */
> +/* { dg-final { scan-assembler-not "movl\[\\t \]\*eax" } } */
> +
> +unsigned long long sample1(unsigned long long m) {
> +  unsigned int t = -1;
> +  return (m << 1) & t;
> +}
> --
> 2.18.1
>
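
As a side note on the mask question raised in the reply, here is a minimal C sketch of the arithmetic only. It assumes nothing about how combine or ix86_decompose_address work internally; the helper names (shl1_combine_mask, shl1_zext32, shl1_leal, masks_agree) are hypothetical and exist only for this illustration. It shows that, for the testcase expression, masking with 0xfffffffe and masking with 0xffffffff give the same value, and that both match a 32-bit add whose result is zero-extended, which is what a leal would leave in the 64-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Mask that appears in the combine dump for the shifted value. */
#define COMBINE_MASK 0xfffffffeULL
/* Canonical mask for zero-extending the low 32 bits. */
#define ZEXT32_MASK  0xffffffffULL

/* Hypothetical helper: (m << 1) masked the way combine wrote it. */
static uint64_t shl1_combine_mask(uint64_t m)
{
    return (m << 1) & COMBINE_MASK;
}

/* Hypothetical helper: same shift with the canonical zero-extension mask. */
static uint64_t shl1_zext32(uint64_t m)
{
    return (m << 1) & ZEXT32_MASK;
}

/* Hypothetical helper: model of a 32-bit lea (m + m in 32 bits),
   with the result zero-extended to 64 bits. */
static uint64_t shl1_leal(uint64_t m)
{
    uint32_t lo = (uint32_t)m;
    return (uint64_t)(uint32_t)(lo + lo);
}

/* Returns 1 when all three formulations agree for this input. */
static int masks_agree(uint64_t m)
{
    uint64_t a = shl1_combine_mask(m);
    return a == shl1_zext32(m) && a == shl1_leal(m);
}

/* Spot-check a few inputs, including ones with high bits set. */
static void check_samples(void)
{
    const uint64_t samples[] = {
        0, 1, 0x7fffffffULL, 0x80000000ULL, 0x80000001ULL,
        0xffffffffULL, 0x123456789abcdef0ULL, UINT64_MAX
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(masks_agree(samples[i]));
}
```

Bit 0 of (m << 1) is always zero, so clearing it again with 0xfffffffe changes nothing. This only demonstrates that the two masks are interchangeable for this expression; it does not say which form combine ought to produce.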