Jakub Jelinek <ja...@redhat.com> writes: > Hi! > > Similarly to PR87763 for bfi, the GCC 9 combiner changes to not combine > moves from hard registers regressed the following testcase where we no > longer recognize bfxil and emit 3 instructions instead. > > The following patch adds define_insn patterns that match what the combiner > is trying to match in these cases. I haven't been able to see patterns > with the other order of the IOR operands, seems the IL is canonicalized this > way no matter what is written in the source.
Thanks for doing this. LGTM with one nit… > Bootstrapped/regtested on aarch64-linux, ok for trunk? > > 2021-04-12 Jakub Jelinek <ja...@redhat.com> > > PR target/100028 > * config/aarch64/aarch64.md (*aarch64_bfxil<mode>_extr, > *aarch64_bfxilsi_extrdi): New define_insn patterns. > > * gcc.target/aarch64/pr100028.c: New test. > > --- gcc/config/aarch64/aarch64.md.jj 2021-04-10 12:45:40.702654372 +0200 > +++ gcc/config/aarch64/aarch64.md 2021-04-12 17:46:03.610503988 +0200 > @@ -5601,6 +5601,38 @@ (define_insn "*aarch64_bfi<GPI:mode>4_no > [(set_attr "type" "bfm")] > ) > > +(define_insn "*aarch64_bfxil<mode>_extr" > + [(set (match_operand:GPI 0 "register_operand" "=r") > + (ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0") > + (match_operand:GPI 2 "const_int_operand" "n")) > + (zero_extract:GPI > + (match_operand:GPI 3 "register_operand" "r") > + (match_operand:GPI 4 "aarch64_simd_shift_imm_<mode>" "n") > + (match_operand:GPI 5 "aarch64_simd_shift_imm_<mode>" "n"))))] > + "UINTVAL (operands[2]) == HOST_WIDE_INT_M1U << INTVAL (operands[4]) > + && INTVAL (operands[4]) > + && (UINTVAL (operands[4]) + UINTVAL (operands[5]) > + <= GET_MODE_BITSIZE (<MODE>mode))" > + "bfxil\t%<GPI:w>0, %<GPI:w>3, %5, %4" > + [(set_attr "type" "bfm")] > +) > + > +(define_insn "*aarch64_bfxilsi_extrdi" > + [(set (match_operand:SI 0 "register_operand" "=r") > + (ior:SI (and:SI (match_operand:SI 1 "register_operand" "0") > + (match_operand:SI 2 "const_int_operand" "n")) > + (match_operator:SI 6 "subreg_lowpart_operator" > + [(zero_extract:DI > + (subreg:DI (match_operand:SI 3 "register_operand" "r") 0) I don't think we need to restrict this to subregs. It should be OK to have any DI register_operand, since the C condition ensures that the extracted bits come from the low 32 bits specified by the %w3 operand. OK with that change, thanks. Richard > + (match_operand:SI 4 "aarch64_simd_shift_imm_si" "n") > + (match_operand:SI 5 "aarch64_simd_shift_imm_si" "n"))])))] > + "UINTVAL (operands[2]) == HOST_WIDE_INT_M1U << INTVAL (operands[4]) > + && INTVAL (operands[4]) > + && UINTVAL (operands[4]) + UINTVAL (operands[5]) <= 32" > + "bfxil\t%w0, %w3, %5, %4" > + [(set_attr "type" "bfm")] > +) > + > (define_insn "*extr_insv_lower_reg<mode>" > [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r") > (match_operand 1 "const_int_operand" "n") > --- gcc/testsuite/gcc.target/aarch64/pr100028.c.jj 2021-04-12 > 17:38:38.665472799 +0200 > +++ gcc/testsuite/gcc.target/aarch64/pr100028.c 2021-04-12 > 17:38:18.470698594 +0200 > @@ -0,0 +1,22 @@ > +/* PR target/100028 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +#define W 3 > +#define L 11 > + > +int > +foo (int d, int s) > +{ > + int wmask = (1 << W) - 1; > + return (d & ~wmask) | ((s >> L) & wmask); > +} > + > +long long int > +bar (long long int d, long long int s) > +{ > + long long int wmask = (1LL << W) - 1; > + return (d & ~wmask) | ((s >> L) & wmask); > +} > + > +/* { dg-final { scan-assembler-times {\tbfxil\t} 2 } } */ > > Jakub