On Sat, Oct 26, 2024 at 12:20 AM Andrew Pinski <pins...@gmail.com> wrote:
>
> On Thu, Oct 24, 2024 at 6:22 PM Li Xu <xu...@eswincomputing.com> wrote:
> >
> > From: xuli <xu...@eswincomputing.com>
> >
> > When the imm operand op1=1 in the unsigned scalar sat_sub form2 below,
> > we can simplify (x != 0 ? x + ~0 : 0) to (x - (x != 0)), thereby
> > eliminating a branch instruction. This simplification also applies to
> > signed integers.
> >
> > Form2:
> > T __attribute__((noinline)) \
> > sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \
> > { \
> >   return x >= (T)IMM ? x - (T)IMM : 0; \
> > }
> >
> > Take the following instance of form 2 as an example:
> > DEF_SAT_U_SUB_IMM_FMT_2(uint8_t, 1)
> >
> > Before this patch:
> > __attribute__((noinline))
> > uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x)
> > {
> >   uint8_t _1;
> >   uint8_t _3;
> >
> >   <bb 2> [local count: 1073741824]:
> >   if (x_2(D) != 0)
> >     goto <bb 3>; [50.00%]
> >   else
> >     goto <bb 4>; [50.00%]
> >
> >   <bb 3> [local count: 536870912]:
> >   _3 = x_2(D) + 255;
> >
> >   <bb 4> [local count: 1073741824]:
> >   # _1 = PHI <x_2(D)(2), _3(3)>
> >   return _1;
> >
> > }
> >
> > Assembly code:
> > sat_u_sub_imm1_uint8_t_fmt_2:
> >         beq     a0,zero,.L2
> >         addiw   a0,a0,-1
> >         andi    a0,a0,0xff
> > .L2:
> >         ret
> >
> > After this patch:
> > __attribute__((noinline))
> > uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x)
> > {
> >   _Bool _1;
> >   unsigned char _2;
> >   uint8_t _4;
> >
> >   <bb 2> [local count: 1073741824]:
> >   _1 = x_3(D) != 0;
> >   _2 = (unsigned char) _1;
> >   _4 = x_3(D) - _2;
> >   return _4;
> >
> > }
> >
> > Assembly code:
> > sat_u_sub_imm1_uint8_t_fmt_2:
> >         snez    a5,a0
> >         subw    a0,a0,a5
> >         andi    a0,a0,0xff
> >         ret
> >
> > The following test suites pass with this patch:
> > 1. The rv64gcv full regression tests.
> > 2. The x86 bootstrap tests.
> > 3. The x86 full regression tests.
> >
> > Signed-off-by: Li Xu <xu...@eswincomputing.com>
> > gcc/ChangeLog:
> >
> >         * match.pd: Simplify (x != 0 ? x + ~0 : 0) to (x - x != 0).
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.dg/tree-ssa/phi-opt-44.c: New test.
> > ---
> >  gcc/match.pd                               | 10 +++++++++
> >  gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c | 26 ++++++++++++++++++++++
> >  gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c | 26 ++++++++++++++++++++++
> >  3 files changed, 62 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 0455dfa6993..f48fd7d52ba 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -3383,6 +3383,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >        }
> >        (if (wi::eq_p (sum, wi::uhwi (0, precision)))))))
> >
> > +/* The boundary condition for case 10: IMM = 1:
> > +   SAT_U_SUB = X >= IMM ? (X - IMM) : 0.
> > +   simplify (X != 0 ? X + ~0 : 0) to (X - X != 0).  */
> > +(simplify
> > + (cond (ne@1 @0 integer_zerop)
> > +       (nop_convert? (plus (nop_convert? @0) integer_all_onesp))
> > +       integer_zerop)
> > + (if (INTEGRAL_TYPE_P (type))
> > +  (minus @0 (convert @1))))
>
> This looks good to me, though I can't approve it.
OK.

Thanks,
Richard.

> Thanks,
> Andrew
>
> > +
> >  /* Signed saturation sub, case 1:
> >      T minus = (T)((UT)X - (UT)Y);
> >      SAT_S_SUB = (X ^ Y) & (X ^ minus) < 0 ? (-(T)(X < 0) ^ MAX) : minus;
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
> > new file mode 100644
> > index 00000000000..962bf0954f6
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
> > @@ -0,0 +1,26 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-phiopt1" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t f1 (uint8_t x)
> > +{
> > +  return x >= (uint8_t)1 ? x - (uint8_t)1 : 0;
> > +}
> > +
> > +uint16_t f2 (uint16_t x)
> > +{
> > +  return x >= (uint16_t)1 ? x - (uint16_t)1 : 0;
> > +}
> > +
> > +uint32_t f3 (uint32_t x)
> > +{
> > +  return x >= (uint32_t)1 ? x - (uint32_t)1 : 0;
> > +}
> > +
> > +uint64_t f4 (uint64_t x)
> > +{
> > +  return x >= (uint64_t)1 ? x - (uint64_t)1 : 0;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-not "goto" "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c
> > new file mode 100644
> > index 00000000000..62a2ab63184
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c
> > @@ -0,0 +1,26 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-phiopt1" } */
> > +
> > +#include <stdint.h>
> > +
> > +int8_t f1 (int8_t x)
> > +{
> > +  return x != 0 ? x - (int8_t)1 : 0;
> > +}
> > +
> > +int16_t f2 (int16_t x)
> > +{
> > +  return x != 0 ? x - (int16_t)1 : 0;
> > +}
> > +
> > +int32_t f3 (int32_t x)
> > +{
> > +  return x != 0 ? x - (int32_t)1 : 0;
> > +}
> > +
> > +int64_t f4 (int64_t x)
> > +{
> > +  return x != 0 ? x - (int64_t)1 : 0;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-not "goto" "phiopt1" } } */
> > --
> > 2.17.1
> >