On Wed, 8 Jan 2025, Zhou Zhao wrote:

> 
> 在 2025/1/7 下午10:45, Richard Biener 写道:
> > On Thu, 2 Jan 2025, 赵洲 wrote:
> >
> >> Add Reviewer Richard Biener.
> >>
> >>
> >>> -----原始邮件-----
> >>> 发件人: "Zhou Zhao" <zhaoz...@loongson.cn>
> >>> 发送时间:2025-01-02 19:37:07 (星期四)
> >>> 收件人: gcc-patches@gcc.gnu.org
> >>> 抄送: xry...@xry111.site, i...@xen0n.name, chengl...@loongson.cn,
> >>> xucheng...@loongson.cn, zhaoz...@loongson.cn
> >>> 主题: [PATCH] match.pd: Fold pattern of round semantics.
> >>>
> >>> This patch implements 4 rules for semantics of round func in match.pd
> >>> under
> >>> -funsafe-math-optimizations:
> >>> 1) (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) -> floor(x+0.5)
> >>> 2) (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) -> floor(x+0.5)
> >>> 3) (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) -> floor(x+0.5)
> >>> 4) (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) -> floor(x+0.5)
> >>>
> >>> The patch implements floor(x+0.5) operation to replace semantics of
> >>> round(x) function.
> >>> The patch was regtested on aarch64-linux-gnu and x86_64-linux-gnu, SPEC
> >>> 2017 and SPEC 2006 were run:
> >>> As for SPEC 2017, 538.imagick_r benchmark performance increased by 3%+
> >>> in base test of ratio mode.
> >>> As for SPEC 2006, while the transform does not seem to be triggered, we
> >>> also see no non-noise impact on performance.
> >>> OK for mainline?
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>>  * match.pd: Add new pattern for round.
> >>>
> >>> gcc/testsuite/ChangeLog:
> >>>
> >>>   * gcc.dg/fold-round-1.c: New test.
> >>> ---
> >>>   gcc/match.pd                        | 27 ++++++++++++++
> >>>   gcc/testsuite/gcc.dg/fold-round-1.c | 56 +++++++++++++++++++++++++++++
> >>>   2 files changed, 83 insertions(+)
> >>>   create mode 100644 gcc/testsuite/gcc.dg/fold-round-1.c
> >>>
> >>> diff --git a/gcc/match.pd b/gcc/match.pd
> >>> index 83eca8b2e0a..7b22b7913ac 100644
> >>> --- a/gcc/match.pd
> >>> +++ b/gcc/match.pd
> >>> @@ -777,6 +777,33 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >>>    (rdiv @0 (negate @1))
> >>>    (rdiv (negate @0) @1))
> >>>   +(if (flag_unsafe_math_optimizations)
> >>> +/* convert semantics of round(x) function to floor(x+0.5) */
> >>> +/* (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) --> floor(x+0.5) */
> >>> +/* (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) --> floor(x+0.5) */
> >>> +/* (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) --> floor(x+0.5) */
> >>> +/* (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) --> floor(x+0.5) */
> 
> Hi, Richard
> 
> Thanks for your reply. This patch is not fixes a bug. If we can discuss
> 
> and reach a conclusion now, I will be re-posted/pinged when stage1 opens
> 
> for GCC 16.
> 
> Zhou Zhao.
> > I think you should be able to merge the cases for swapped compares
> > by just matching (cond (op:c (...)).
> 
> If use swapped compares to merge the cases, the expression
> 
> ((x - floor(x)) < (ceil(x) - x) ? floor(x) : ceil(x)) can be transformed to
> 
> ((ceil(x) - x) < (x - floor(x)) ? floor(x) : ceil(x)), this is not the
> expression
> 
> I want, not be simplified to floor(x + 0.5). For example, when x = 1.4
> 
> the result of new expression is ceil(1.4)=2.0, but floor(1.4+0.5)=1.0.
> 
> So I did not use swapped compares to merge the cases.

The swapped compare will match

(ceil(x) - x) > (x - floor(x) ? floor(x) : ceil(x)

> > I do wonder about the compares itself though, (x-floor(x)) < (ceil(x)-x)
> > isn't that true exactly when x > 0?  With flag_unsafe_optimizations
> > < vs. <= and thus x not having a fractional component would be
> > unimportant.  That said, wouldn't we expect the compares to be
> > simplified?
> 
> I agree your said, this expression can be simplified when x not haveing a
> 
> fractional component, but it's difficult to determine whether x having a
> 
> fractional component. Maybe constrain x is int type and matching
> 
> (floor (float@1 @0)) can simplify the expression. But that's not my purpose,
> 
> maybe the commits is not clear enough, I would rephrase it.
> 
> We only need to focus on the first pattern,the other three are different
> 
> forms of it. I will use "left distance" to represent the distance from x
> 
> to its left integer boundary, "right distance" to represent the distance
> 
> from x to its right integer boundary.
> 
> The pattern semantics as follows:
> 
> as for x, if left distance is less than right distance, return floor(x),
> 
> if left distance is greater or equal than right distance, return ceil(x).
> 
> For example, when x=1.4, the left distance is less than right distance,
> 
> return floor(x). (x+0.5) has not crossed the right boundary. floor(x+0.5)
> 
> is equal to floor(x). when x=1.5 or x=1.6, the left distance is not less
> 
> than right distance, return ceil(x). (x+0.5) has crossed the right
> 
> boundary. floor(x+0.5) is equal to ceil(x).

I see.  So how does it relate to rint/nearbyint?

> > I'm also curious where you got these expressions to be simplified from?
> 
> In the 538.imagick_r benchmark of Spec2017, I find this pattern from
> 
> MagickRound function. By simplifying the pattern, the performance of the
> 
> 538 benchmark can be improved by 3% to 4% on aarch64-linux-gnu and
>
> x86_64-linux-gnu.

Ah, OK, can you mention this in the commit message?

Thanks,
Richard.
 
> > Note we're in stage3 so unless this fixes a bug it isn't appropriate
> > at this stage and should be re-posted/pinged when stage1 opens for
> > GCC 16.
> >
> > Richard.
> >
> >>> +(for op (lt ge)
> >>> +     bt (FLOOR CEIL)
> >>> +     bf (CEIL FLOOR)
> >>> +     floor (FLOOR FLOOR)
> >>> +     ceil (CEIL CEIL)
> >>> + (simplify
> >>> +  (cond (op (minus:s SSA_NAME@0 (floor SSA_NAME@0))
> >>> +     (minus:s (ceil SSA_NAME@0) SSA_NAME@0))
> >>> + (bt SSA_NAME@0) (bf SSA_NAME@0))
> >>> +  (floor (plus @0 { build_real (type, dconsthalf); }))))
> >>> +(for op (gt le)
> >>> +     bt (FLOOR CEIL)
> >>> +     bf (CEIL FLOOR)
> >>> +     floor (FLOOR FLOOR)
> >>> +     ceil (CEIL CEIL)
> >>> + (simplify
> >>> +  (cond (op (minus:s (ceil SSA_NAME@0) SSA_NAME@0)
> >>> +     (minus:s SSA_NAME@0 (floor SSA_NAME@0)))
> >>> + (bt SSA_NAME@0) (bf SSA_NAME@0))
> >>> +  (floor (plus @0 { build_real (type, dconsthalf); })))))
> >>> +
> >>>   (if (flag_unsafe_math_optimizations)
> >>>    /* Simplify (C / x op 0.0) to x op 0.0 for C != 0, C != Inf/Nan.
> >>>       Since C / x may underflow to zero, do this only for unsafe math.  */
> >>> diff --git a/gcc/testsuite/gcc.dg/fold-round-1.c
> >>> b/gcc/testsuite/gcc.dg/fold-round-1.c
> >>> new file mode 100644
> >>> index 00000000000..845d6d2e475
> >>> --- /dev/null
> >>> +++ b/gcc/testsuite/gcc.dg/fold-round-1.c
> >>> @@ -0,0 +1,56 @@
> >>> +/* { dg-do compile } */
> >>> +/* { dg-options "-Ofast -funsafe-math-optimizations" } */
> >>> +
> >>> +extern void link_error (void);
> >>> +
> >>> +#define TEST_ROUND(TYPE, FFLOOR, FCEIL)
> >>> \
> >>> +  void round_##FFLOOR##_1 (TYPE x)
> >>> \
> >>> +  {
> >>> \
> >>> +    TYPE t1 = 0;
> >>> \
> >>> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);
> >>> \
> >>> +    if ((x - __builtin_##FFLOOR (x)) < (__builtin_##FCEIL (x) - x))
> >>> \
> >>> +      t1 = __builtin_##FFLOOR (x);
> >>> \
> >>> +    else
> >>> \
> >>> +      t1 = __builtin_##FCEIL (x);
> >>> \
> >>> +    if (t1 != t2)
> >>> \
> >>> +      link_error ();
> >>> \
> >>> +  }
> >>> \
> >>> +  void round_##FFLOOR##_2 (TYPE x)
> >>> \
> >>> +  {
> >>> \
> >>> +    TYPE t1 = 0;
> >>> \
> >>> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);
> >>> \
> >>> +    if ((__builtin_##FCEIL (x) - x) > (x - __builtin_##FFLOOR (x)))
> >>> \
> >>> +      t1 = __builtin_##FFLOOR (x);
> >>> \
> >>> +    else
> >>> \
> >>> +      t1 = __builtin_##FCEIL (x);
> >>> \
> >>> +    if (t1 != t2)
> >>> \
> >>> +      link_error ();
> >>> \
> >>> +  }
> >>> \
> >>> +  void round_##FFLOOR##_3 (TYPE x)
> >>> \
> >>> +  {
> >>> \
> >>> +    TYPE t1 = 0;
> >>> \
> >>> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);
> >>> \
> >>> +    if ((__builtin_##FCEIL (x) - x) <= (x - __builtin_##FFLOOR (x)))
> >>> \
> >>> +      t1 = __builtin_##FCEIL (x);
> >>> \
> >>> +    else
> >>> \
> >>> +      t1 = __builtin_##FFLOOR (x);
> >>> \
> >>> +    if (t1 != t2)
> >>> \
> >>> +      link_error ();
> >>> \
> >>> +  }
> >>> \
> >>> +  void round_##FFLOOR##_4 (TYPE x)
> >>> \
> >>> +  {
> >>> \
> >>> +    TYPE t1 = 0;
> >>> \
> >>> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);
> >>> \
> >>> +    if ((x - __builtin_##FFLOOR (x)) >= (__builtin_##FCEIL (x) - x))
> >>> \
> >>> +      t1 = __builtin_##FCEIL (x);
> >>> \
> >>> +    else
> >>> \
> >>> +      t1 = __builtin_##FFLOOR (x);
> >>> \
> >>> +    if (t1 != t2)
> >>> \
> >>> +      link_error ();
> >>> \
> >>> +  }
> >>> +
> >>> +TEST_ROUND (float, floorf, ceilf)
> >>> +TEST_ROUND (double, floor, ceil)
> >>> +TEST_ROUND (long double, floorl, ceill)
> >>> +
> >>> +/* { dg-final { scan-assembler-not "link_error" } } */
> >>> -- 
> >>> 2.20.1
> >>>
> 
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to