On Tue, Nov 28, 2023 at 7:56 PM Andrew Pinski <pins...@gmail.com> wrote: > > On Tue, Nov 28, 2023 at 7:38 AM <ezra.sito...@arm.com> wrote: > > > > From: Ezra Sitorus <ezra.sito...@arm.com> > > > > Add the transform rule (T)(~A) -> ~(T)(A) for view_convert. The simplified > > result could be a single assembly instruction when chained with other > > instructions. > > > > gcc/ChangeLog: > > * match.pd: Add new transform rule. > > * > > testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpretq_vmvnq.c: Add > > new test > > * testsuite/gcc.target/arm/simd/vreinterpretq_vmvnq_1.c: Add new > > test > > --- > > gcc/match.pd | 5 ++++ > > .../advsimd-intrinsics/vreinterpretq_vmvnq.c | 25 ++++++++++++++++++ > > .../arm/simd/vreinterpretq_vmvnq_1.c | 26 +++++++++++++++++++ > > 3 files changed, 56 insertions(+) > > create mode 100644 > > gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpretq_vmvnq.c > > create mode 100644 > > gcc/testsuite/gcc.target/arm/simd/vreinterpretq_vmvnq_1.c > > > > diff --git a/gcc/match.pd b/gcc/match.pd > > index 95225e4ca5f..273230a7681 100644 > > --- a/gcc/match.pd > > +++ b/gcc/match.pd > > @@ -3576,6 +3576,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > > && !TYPE_OVERFLOW_SANITIZED (type)) > > (convert (op! @0 @1))))) > > > > +/* (T)(~A) -> ~(T)A */ > > + (simplify > > + (view_convert (bit_not @0)) > > + (bit_not (view_convert @0))) > > This is not wrong for a few reasons. The outer type needs to be an > integral (scalar or vector) type for this to be valid. Plus this might > not be a good (or valid) idea to do for boolean types. > So I think the following check would be needed: > if ((INTEGRAL_TYPE_P (type) && TREE_CODE (type) != BOOLEAN_TYPE) > || (VECTOR_TYPE_P (type) && INTEGRAL_TYPE_P (TREE_TYPE (type)) > && !VECTOR_BOOLEAN_TYPE_P (type))) > > Note this might also cause issues with enum types which sometimes have > constrained type ranges.
It's also not a simplification but at most a canonicalization where it's not clear which order is better. You also have to check for target support for the bit_not operation on vectors since a V_C_E can change the vector mode. I also believe there's no corresponding canonicalization for (convert (bit_not ..)) (to be equivalent it would be restricted to no-op converts I guess). Richard. > Thanks, > Andrew Pinski > > > > + > > /* ~A + A -> -1 */ > > (simplify > > (plus:c (convert? (bit_not @0)) (convert? @0)) > > diff --git > > a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpretq_vmvnq.c > > b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpretq_vmvnq.c > > new file mode 100644 > > index 00000000000..ed82c844bd4 > > --- /dev/null > > +++ > > b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpretq_vmvnq.c > > @@ -0,0 +1,25 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O2" } */ > > + > > +#include <arm_neon.h> > > + > > +int64x2_t test_vector1(int32x4_t a, int32x4_t b) > > +{ > > + return vandq_s64(vreinterpretq_s64_s32(vmvnq_s32(a)), > > + vreinterpretq_s64_s32(b)); > > +} > > + > > +int64x2_t test_vector2(int32x4_t a, int16x8_t b) > > +{ > > + return vandq_s64(vreinterpretq_s64_s32(vmvnq_s32(a)), > > + vreinterpretq_s64_s16(b)); > > +} > > + > > +int64x2_t test_vector3(int32x4_t a, int64x2_t b) > > +{ > > + return vandq_s64(vreinterpretq_s64_s32(vmvnq_s32(a)), b); > > +} > > + > > +/* { dg-final { scan-assembler-times {\tbic\t} 3 } } */ > > +/* { dg-final { scan-assembler-not {\tand\t} } } */ > > +/* { dg-final { scan-assembler-not {\tmvn\t} } } */ > > diff --git a/gcc/testsuite/gcc.target/arm/simd/vreinterpretq_vmvnq_1.c > > b/gcc/testsuite/gcc.target/arm/simd/vreinterpretq_vmvnq_1.c > > new file mode 100644 > > index 00000000000..a34425100ea > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/arm/simd/vreinterpretq_vmvnq_1.c > > @@ -0,0 +1,26 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O2" } */ > > +/* { dg-additional-options "-march=armv8.2-a -mfloat-abi=hard -mfpu=neon" > > } */ > > + > > +#include <arm_neon.h> > > + > > +int64x2_t test_vector1(int32x4_t a, int32x4_t b) > > +{ > > + return vandq_s64(vreinterpretq_s64_s32(vmvnq_s32(a)), > > + vreinterpretq_s64_s32(b)); > > +} > > + > > +int64x2_t test_vector2(int32x4_t a, int16x8_t b) > > +{ > > + return vandq_s64(vreinterpretq_s64_s32(vmvnq_s32(a)), > > + vreinterpretq_s64_s16(b)); > > +} > > + > > +int64x2_t test_vector3(int32x4_t a, int64x2_t b) > > +{ > > + return vandq_s64(vreinterpretq_s64_s32(vmvnq_s32(a)), b); > > +} > > + > > +/* { dg-final { scan-assembler-times {\tvbic\t} 3 } } */ > > +/* { dg-final { scan-assembler-not {\tvand\t} } } */ > > +/* { dg-final { scan-assembler-not {\tvmvn\t} } } */ > > -- > > 2.25.1 > >