> -----Original Message-----
> From: Tamar Christina <tamar.christ...@arm.com>
> Sent: Wednesday, September 29, 2021 5:22 PM
> To: gcc-patches@gcc.gnu.org
> Cc: nd <n...@arm.com>; Richard Earnshaw <richard.earns...@arm.com>;
> Marcus Shawcroft <marcus.shawcr...@arm.com>; Kyrylo Tkachov
> <kyrylo.tkac...@arm.com>; Richard Sandiford
> <richard.sandif...@arm.com>
> Subject: [PATCH 7/7]AArch64 Combine cmeq 0 + not into cmtst
>
> Hi All,
>
> This turns a bitwise inverse of an equality comparison with 0 into a compare
> of bitwise nonzero (cmtst).
>
> We already have one pattern for cmtst; this adds an additional one which
> does not require an additional bitwise and.
>
> i.e.
>
> #include <arm_neon.h>
>
> uint8x8_t bar(int16x8_t abs_row0, int16x8_t row0) {
>   uint16x8_t row0_diff =
>     vreinterpretq_u16_s16(veorq_s16(abs_row0, vshrq_n_s16(row0, 15)));
>   uint8x8_t abs_row0_gt0 =
>     vmovn_u16(vcgtq_u16(vreinterpretq_u16_s16(abs_row0), vdupq_n_u16(0)));
>   return abs_row0_gt0;
> }
>
> now generates:
>
> bar:
>         cmtst   v0.8h, v0.8h, v0.8h
>         xtn     v0.8b, v0.8h
>         ret
>
> instead of:
>
> bar:
>         cmeq    v0.8h, v0.8h, #0
>         not     v0.16b, v0.16b
>         xtn     v0.8b, v0.8h
>         ret
>
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>       * config/aarch64/aarch64-simd.md
>       (*aarch64_cmtst_same_<mode>): New.
>
> gcc/testsuite/ChangeLog:
>
>       * gcc.target/aarch64/mvn-cmeq0-1.c: New test.
>
> --- inline copy of patch --
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 9d936428b438c95b56614c94081d7e2ebc47d89f..bce01c36386074bf475b8b7e5c69a1959a13fef3 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -6585,6 +6585,23 @@ (define_insn "aarch64_cmtst<mode>"
>    [(set_attr "type" "neon_tst<q>")]
>  )
>
> +;; One can also get a cmtsts by having to combine a
> +;; not (neq (eq x 0)) in which case you rewrite it to
> +;; a comparison against itself
> +
> +(define_insn "*aarch64_cmtst_same_<mode>"
> +  [(set (match_operand:<V_INT_EQUIV> 0 "register_operand" "=w")
> +        (plus:<V_INT_EQUIV>
> +          (eq:<V_INT_EQUIV>
> +            (match_operand:VDQ_I 1 "register_operand" "w")
> +            (match_operand:VDQ_I 2 "aarch64_simd_imm_zero"))
> +          (match_operand:<V_INT_EQUIV> 3 "aarch64_simd_imm_minus_one")))
> +  ]
> +  "TARGET_SIMD"
> +  "cmtst\t%<v>0<Vmtype>, %<v>1<Vmtype>, %<v>1<Vmtype>"
> +  [(set_attr "type" "neon_tst<q>")]
> +)
> +
>  (define_insn_and_split "aarch64_cmtstdi"
>    [(set (match_operand:DI 0 "register_operand" "=w,r")
>          (neg:DI
> diff --git a/gcc/testsuite/gcc.target/aarch64/mvn-cmeq0-1.c b/gcc/testsuite/gcc.target/aarch64/mvn-cmeq0-1.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..59f3a230271c70d3bb51d0338d9ec2613bd4394b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/mvn-cmeq0-1.c
> @@ -0,0 +1,17 @@
> +/* { dg-do assemble } */
> +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */

I don't think we need the param here (or even anything higher than -O really).
Ok otherwise.
Thanks,
Kyrill

> +
> +#include <arm_neon.h>
> +
> +uint8x8_t bar(int16x8_t abs_row0, int16x8_t row0) {
> +  uint16x8_t row0_diff =
> +    vreinterpretq_u16_s16(veorq_s16(abs_row0, vshrq_n_s16(row0, 15)));
> +  uint8x8_t abs_row0_gt0 =
> +    vmovn_u16(vcgtq_u16(vreinterpretq_u16_s16(abs_row0), vdupq_n_u16(0)));
> +  return abs_row0_gt0;
> +}
> +
> +
> +/* { dg-final { scan-assembler-times {\tcmtst\t} 1 } } */
> +/* { dg-final { scan-assembler-not {\tcmeq\t} } } */
> +/* { dg-final { scan-assembler-not {\tnot\t} } } */
>
>
> --