Nit: for title, should be SAT_SUB(x, 1) instead of SAT_SUB(x - 1)? Pan
-----Original Message----- From: Andrew Pinski <pins...@gmail.com> Sent: Wednesday, October 23, 2024 1:46 PM To: Li Xu <xu...@eswincomputing.com> Cc: gcc-patches@gcc.gnu.org; kito.ch...@gmail.com; richard.guent...@gmail.com; tamar.christ...@arm.com; juzhe.zh...@rivai.ai; Li, Pan2 <pan2...@intel.com>; jeffreya...@gmail.com; rdapp....@gmail.com Subject: Re: [PATCH 1/2] Match: Simplify unsigned scalar sat_sub(x -1) to (x - x != 0) On Tue, Oct 22, 2024 at 10:39 PM Li Xu <xu...@eswincomputing.com> wrote: > > From: xuli <xu...@eswincomputing.com> > > When the imm operand op1=1 in the unsigned scalar sat_sub form2 below, > we can simplify (x != 0 ? x + max : 0) to (x - x != 0), thereby eliminating > a branch instruction. > > Form2: > T __attribute__((noinline)) \ > sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \ > { \ > return x >= (T)IMM ? x - (T)IMM : 0; \ > } > > Take below form 2 as example: > DEF_SAT_U_SUB_IMM_FMT_2(uint8_t, 1) > > Before this patch: > __attribute__((noinline)) > uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x) > { > uint8_t _1; > uint8_t _3; > > <bb 2> [local count: 1073741824]: > if (x_2(D) != 0) > goto <bb 3>; [50.00%] > else > goto <bb 4>; [50.00%] > > <bb 3> [local count: 536870912]: > _3 = x_2(D) + 255; > > <bb 4> [local count: 1073741824]: > # _1 = PHI <x_2(D)(2), _3(3)> > return _1; > > } > > Assembly code: > sat_u_sub_imm1_uint8_t_fmt_2: > beq a0,zero,.L2 > addiw a0,a0,-1 > andi a0,a0,0xff > .L2: > ret > > After this patch: > __attribute__((noinline)) > uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x) > { > _Bool _1; > unsigned char _2; > uint8_t _4; > > <bb 2> [local count: 1073741824]: > _1 = x_3(D) != 0; > _2 = (unsigned char) _1; > _4 = x_3(D) - _2; > return _4; > > } > > Assembly code: > sat_u_sub_imm1_uint8_t_fmt_2: > snez a5,a0 > subw a0,a0,a5 > andi a0,a0,0xff > ret > > The below test suites are passed for this patch: > 1. The rv64gcv fully regression tests. > 2. The x86 bootstrap tests. > 3. The x86 fully regression tests. > > Signed-off-by: Li Xu <xu...@eswincomputing.com> > gcc/ChangeLog: > > * match.pd: Simplify (x != 0 ? x + max : 0) to (x - x != 0). > --- > gcc/match.pd | 16 ++++++++++++++++ > 1 file changed, 16 insertions(+) > > diff --git a/gcc/match.pd b/gcc/match.pd > index 0455dfa6993..9dcb9555cc0 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -3383,6 +3383,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > } > (if (wi::eq_p (sum, wi::uhwi (0, precision))))))) > > +/* The boundary condition for case 10: IMM = 1: > + SAT_U_SUB = X >= IMM ? (X - IMM) : 0. > + simplify (X != 0 ? X + max : 0) to (X - X != 0). */ > +(simplify > + (cond (ne @0 integer_zerop) (plus @0 INTEGER_CST@1) integer_zerop) I think you could use integer_all_onesp@1 here rather than the check you are doing on the INTEGER_CST. Then you only need the INTEGRAL_TYPE_P check. You should also add a few testcases for this. See `gcc.dg/tree-ssa/phiopt*.c` for similar testcases. Thanks, Andrew Pinski > + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type) > + && types_match (type, @0)) > + (with > + { > + unsigned precision = TYPE_PRECISION (type); > + wide_int max = wi::mask (precision, false, precision); > + wide_int c1 = wi::to_wide (@1); > + } > + (if (wi::eq_p (c1, max)) > + (minus @0 (convert (ne @0 { build_zero_cst (type); }))))))) > + > /* Signed saturation sub, case 1: > T minus = (T)((UT)X - (UT)Y); > SAT_S_SUB = (X ^ Y) & (X ^ minus) < 0 ? (-(T)(X < 0) ^ MAX) : minus; > -- > 2.17.1 >