On Tue, Nov 26, 2024 at 2:25 AM <pan2...@intel.com> wrote:
>
> From: Pan Li <pan2...@intel.com>
>
> There are some forms like below failed to recog the SAT_ADD

Some forms like below failed to be recognized as SAT_ADD ...

> pattern for target i386. It is related to some match pattern
> extraction but get fixed after the refactor of the SAT_ADD
> pattern. Thus, add testcases to ensure we may have similar
> issue in futrue.
>
> #define DEF_SAT_ADD(T) \
> T sat_add_##T (T x, T y) \
> { \
>   T res; \
>   res = x + y; \
>   res |= -(T)(res < x); \
>   return res; \
> }
>
> #define VEC_DEF_SAT_ADD(T) \
> void vec_sat_add(T * restrict a, T * restrict b) \
> { \
>   for (int i = 0; i < 8; i++) \
>     b[i] = sat_add_##T (a[i], b[i]); \
> }
>
> DEF_SAT_ADD (uint32_t)
> VEC_DEF_SAT_ADD (uint32_t)
>
> The below test suites are passed for this patch.
> make -k check-gcc RUNTESTFLAGS="--target_board=unix\{,-m32\}
> i386.exp=pr112600-5a-*.c"
>
> PR target/112600
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr112600-5a-u16.c: New test.
> * gcc.target/i386/pr112600-5a-u32.c: New test.
> * gcc.target/i386/pr112600-5a-u64.c: New test.
> * gcc.target/i386/pr112600-5a-u8.c: New test.
> * gcc.target/i386/pr112600-5a.h: New test.
>
> Signed-off-by: Pan Li <pan2...@intel.com>
> ---
> .../gcc.target/i386/pr112600-5a-u16.c | 10 +++++++++
> .../gcc.target/i386/pr112600-5a-u32.c | 10 +++++++++
> .../gcc.target/i386/pr112600-5a-u64.c | 12 ++++++++++
> .../gcc.target/i386/pr112600-5a-u8.c | 11 ++++++++++
> gcc/testsuite/gcc.target/i386/pr112600-5a.h | 22 +++++++++++++++++++
> 5 files changed, 65 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/i386/pr112600-5a-u16.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr112600-5a-u32.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr112600-5a-u64.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr112600-5a-u8.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr112600-5a.h
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr112600-5a-u16.c
> b/gcc/testsuite/gcc.target/i386/pr112600-5a-u16.c
> new file mode 100644
> index 00000000000..8a4a3e4443c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr112600-5a-u16.c
> @@ -0,0 +1,10 @@
> +/* PR target/112600 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse2 -fdump-tree-optimized" } */
> +
> +#include "pr112600-5a.h"
> +
> +DEF_SAT_ADD (uint16_t)
> +VEC_DEF_SAT_ADD (uint16_t)
> +
> +/* { dg-final { scan-tree-dump-times ".SAT_ADD " 2 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr112600-5a-u32.c
> b/gcc/testsuite/gcc.target/i386/pr112600-5a-u32.c
> new file mode 100644
> index 00000000000..3a35f4c9770
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr112600-5a-u32.c
> @@ -0,0 +1,10 @@
> +/* PR target/112600 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse2 -fdump-tree-optimized" } */
> +
> +#include "pr112600-5a.h"
> +
> +DEF_SAT_ADD (uint32_t)
> +VEC_DEF_SAT_ADD (uint32_t)

Remove the vector form, since we know it won't be recognized.

> +/* { dg-final { scan-tree-dump-times ".SAT_ADD " 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr112600-5a-u64.c
> b/gcc/testsuite/gcc.target/i386/pr112600-5a-u64.c
> new file mode 100644
> index 00000000000..57d05d33fd7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr112600-5a-u64.c
> @@ -0,0 +1,12 @@
> +/* PR target/112600 */
> +/* { dg-do compile } */

Use

/* { dg-do compile { target { ! ia32 } } } */

to limit the testcase to 64-bit targets only.

> +/* { dg-options "-O2 -msse2 -fdump-tree-optimized" } */
> +
> +#include "pr112600-5a.h"
> +
> +DEF_SAT_ADD (uint64_t)
> +VEC_DEF_SAT_ADD (uint64_t)

Remove the vector form, since we know it won't be recognized.

> +
> +/* { dg-final { scan-tree-dump-times ".SAT_ADD " 0 "optimized" { target {
> any-opts { "-m32" } } } } } */
> +/* { dg-final { scan-tree-dump-times ".SAT_ADD " 2 "optimized" { target {
> no-opts { "-m32" } } } } } */

Why are there 2 instances detected for the 64-bit target? Only the scalar
form can be optimized, so I'd expect:

/* { dg-final { scan-tree-dump-times ".SAT_ADD " 1 "optimized" } } */

> diff --git a/gcc/testsuite/gcc.target/i386/pr112600-5a-u8.c
> b/gcc/testsuite/gcc.target/i386/pr112600-5a-u8.c
> new file mode 100644
> index 00000000000..f8f224af730
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr112600-5a-u8.c
> @@ -0,0 +1,11 @@
> +/* PR target/112600 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse2 -fdump-tree-optimized" } */
> +
> +#include "pr112600-5a.h"
> +
> +DEF_SAT_ADD (uint8_t)
> +VEC_DEF_SAT_ADD (uint8_t)
> +
> +/* { dg-final { scan-tree-dump-times ".SAT_ADD " 3 "optimized" { target {
> any-opts { "-m32" } } } } } */
> +/* { dg-final { scan-tree-dump-times ".SAT_ADD " 2 "optimized" { target {
> no-opts { "-m32" } } } } } */

Why are the results different between 32-bit and 64-bit targets? The
results should be the same, because both the scalar and the vector uint8_t
forms can be optimized for both targets, similar to the uint16_t case.

Uros.

> diff --git a/gcc/testsuite/gcc.target/i386/pr112600-5a.h
> b/gcc/testsuite/gcc.target/i386/pr112600-5a.h
> new file mode 100644
> index 00000000000..1e753695e81
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr112600-5a.h
> @@ -0,0 +1,22 @@
> +#ifndef HAVE_DEFINED_PR112600_5A_H
> +#define HAVE_DEFINED_PR112600_5A_H
> +
> +#include <stdint.h>
> +
> +#define DEF_SAT_ADD(T) \
> +T sat_add_##T (T x, T y) \
> +{ \
> +  T res; \
> +  res = x + y; \
> +  res |= -(T)(res < x); \
> +  return res; \
> +}
> +
> +#define VEC_DEF_SAT_ADD(T) \
> +void vec_sat_add(T * restrict a, T * restrict b) \
> +{ \
> +  for (int i = 0; i < 8; i++) \
> +    b[i] = sat_add_##T (a[i], b[i]); \
> +}
> +
> +#endif
> --
> 2.43.0
>
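
For anyone following the thread, here is a minimal sketch of what the scalar form in the header computes. DEF_SAT_ADD is copied from the patch; DEF_SAT_ADD_REF and check_sat_add_uint8_t are hypothetical helpers that are not part of the patch, added only to compare the branchless form against an explicit overflow check. It says nothing about whether GCC recognizes either form as .SAT_ADD; that is what the dump scans in the testcases are for.

```c
#include <assert.h>
#include <stdint.h>

/* Branchless form from the patch: when the addition wraps, (res < x)
   is 1, so -(T)1 has all bits of T set and the OR saturates the
   result to the maximum value of T.  */
#define DEF_SAT_ADD(T)            \
T sat_add_##T (T x, T y)          \
{                                 \
  T res;                          \
  res = x + y;                    \
  res |= -(T)(res < x);           \
  return res;                     \
}

/* Hypothetical reference form (not in the patch), written with an
   explicit wrap-around check for comparison.  */
#define DEF_SAT_ADD_REF(T)        \
T sat_add_ref_##T (T x, T y)      \
{                                 \
  T sum = (T)(x + y);             \
  return sum < x ? (T)-1 : sum;   \
}

DEF_SAT_ADD (uint8_t)
DEF_SAT_ADD_REF (uint8_t)
DEF_SAT_ADD (uint32_t)
DEF_SAT_ADD_REF (uint32_t)

/* Hypothetical helper: compare both forms over every uint8_t pair.  */
void check_sat_add_uint8_t (void)
{
  for (unsigned x = 0; x <= UINT8_MAX; x++)
    for (unsigned y = 0; y <= UINT8_MAX; y++)
      assert (sat_add_uint8_t ((uint8_t) x, (uint8_t) y)
              == sat_add_ref_uint8_t ((uint8_t) x, (uint8_t) y));
}
```

Calling check_sat_add_uint8_t () from a test driver covers all 65536 uint8_t input pairs; the wider types can only be spot-checked this way, for example sat_add_uint32_t (UINT32_MAX, 1) against UINT32_MAX.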