Re: [PATCH] match.pd: Fold pattern of round semantics.

Richard Biener Thu, 09 Jan 2025 01:18:49 -0800

On Thu, 9 Jan 2025, Zhou Zhao wrote:

> 
> 在 2025/1/9 下午3:33, Richard Biener 写道:
> > On Thu, 9 Jan 2025, Zhou Zhao wrote:
> >
> >> 在 2025/1/8 下午6:30, Richard Biener 写道:
> >>> On Wed, 8 Jan 2025, Zhou Zhao wrote:
> >>>
> >>>> 在 2025/1/8 下午5:04, Richard Biener 写道:
> >>>>> On Wed, 8 Jan 2025, Zhou Zhao wrote:
> >>>>>
> >>>>>> 在 2025/1/7 下午10:45, Richard Biener 写道:
> >>>>>>> On Thu, 2 Jan 2025, 赵洲 wrote:
> >>>>>>>
> >>>>>>>> Add Reviewer Richard Biener.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> -----原始邮件-----
> >>>>>>>>> 发件人: "Zhou Zhao" <zhaoz...@loongson.cn>
> >>>>>>>>> 发送时间:2025-01-02 19:37:07 (星期四)
> >>>>>>>>> 收件人: gcc-patches@gcc.gnu.org
> >>>>>>>>> 抄送: xry...@xry111.site, i...@xen0n.name, chengl...@loongson.cn,
> >>>>>>>>> xucheng...@loongson.cn, zhaoz...@loongson.cn
> >>>>>>>>> 主题: [PATCH] match.pd: Fold pattern of round semantics.
> >>>>>>>>>
> >>>>>>>>> This patch implements 4 rules for semantics of round func in
> >>>>>>>>> match.pd
> >>>>>>>>> under
> >>>>>>>>> -funsafe-math-optimizations:
> >>>>>>>>> 1) (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) -> floor(x+0.5)
> >>>>>>>>> 2) (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) -> floor(x+0.5)
> >>>>>>>>> 3) (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) -> floor(x+0.5)
> >>>>>>>>> 4) (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) -> floor(x+0.5)
> >>>>>>>>>
> >>>>>>>>> The patch implements floor(x+0.5) operation to replace semantics of
> >>>>>>>>> round(x) function.
> >>>>>>>>> The patch was regtested on aarch64-linux-gnu and x86_64-linux-gnu,
> >>>>>>>>> SPEC
> >>>>>>>>> 2017 and SPEC 2006 were run:
> >>>>>>>>> As for SPEC 2017, 538.imagick_r benchmark performance increased by
> >>>>>>>>> 3%+
> >>>>>>>>> in base test of ratio mode.
> >>>>>>>>> As for SPEC 2006, while the transform does not seem to be triggered,
> >>>>>>>>> we
> >>>>>>>>> also see no non-noise impact on performance.
> >>>>>>>>> OK for mainline?
> >>>>>>>>>
> >>>>>>>>> gcc/ChangeLog:
> >>>>>>>>>
> >>>>>>>>>     * match.pd: Add new pattern for round.
> >>>>>>>>>
> >>>>>>>>> gcc/testsuite/ChangeLog:
> >>>>>>>>>
> >>>>>>>>>     * gcc.dg/fold-round-1.c: New test.
> >>>>>>>>> ---
> >>>>>>>>>      gcc/match.pd                        | 27 ++++++++++++++
> >>>>>>>>>      gcc/testsuite/gcc.dg/fold-round-1.c | 56
> >>>>>>>>>      +++++++++++++++++++++++++++++
> >>>>>>>>>      2 files changed, 83 insertions(+)
> >>>>>>>>>      create mode 100644 gcc/testsuite/gcc.dg/fold-round-1.c
> >>>>>>>>>
> >>>>>>>>> diff --git a/gcc/match.pd b/gcc/match.pd
> >>>>>>>>> index 83eca8b2e0a..7b22b7913ac 100644
> >>>>>>>>> --- a/gcc/match.pd
> >>>>>>>>> +++ b/gcc/match.pd
> >>>>>>>>> @@ -777,6 +777,33 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >>>>>>>>>       (rdiv @0 (negate @1))
> >>>>>>>>>       (rdiv (negate @0) @1))
> >>>>>>>>>      +(if (flag_unsafe_math_optimizations)
> >>>>>>>>> +/* convert semantics of round(x) function to floor(x+0.5) */
> >>>>>>>>> +/* (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) --> floor(x+0.5)
> >>>>>>>>> */
> >>>>>>>>> +/* (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) -->
> >>>>>>>>> floor(x+0.5)
> >>>>>>>>> */
> >>>>>>>>> +/* (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) --> floor(x+0.5)
> >>>>>>>>> */
> >>>>>>>>> +/* (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) -->
> >>>>>>>>> floor(x+0.5)
> >>>>>>>>> */
> >>>>>> Hi, Richard
> >>>>>>
> >>>>>> Thanks for your reply. This patch is not fixes a bug. If we can discuss
> >>>>>>
> >>>>>> and reach a conclusion now, I will be re-posted/pinged when stage1
> >>>>>> opens
> >>>>>>
> >>>>>> for GCC 16.
> >>>>>>
> >>>>>> Zhou Zhao.
> >>>>>>> I think you should be able to merge the cases for swapped compares
> >>>>>>> by just matching (cond (op:c (...)).
> >>>>>> If use swapped compares to merge the cases, the expression
> >>>>>>
> >>>>>> ((x - floor(x)) < (ceil(x) - x) ? floor(x) : ceil(x)) can be
> >>>>>> transformed
> >>>>>> to
> >>>>>>
> >>>>>> ((ceil(x) - x) < (x - floor(x)) ? floor(x) : ceil(x)), this is not the
> >>>>>> expression
> >>>>>>
> >>>>>> I want, not be simplified to floor(x + 0.5). For example, when x = 1.4
> >>>>>>
> >>>>>> the result of new expression is ceil(1.4)=2.0, but floor(1.4+0.5)=1.0.
> >>>>>>
> >>>>>> So I did not use swapped compares to merge the cases.
> >>>>> The swapped compare will match
> >>>>>
> >>>>> (ceil(x) - x) > (x - floor(x) ? floor(x) : ceil(x)
> >>>> Ah, you are correct. I try use the swapped compares merge the cases, it's
> >>>>
> >>>> consistent with what you said. I misunderstood the swapped comparison
> >>>> symbols.
> >>>>
> >>>>>>> I do wonder about the compares itself though, (x-floor(x)) <
> >>>>>>> (ceil(x)-x)
> >>>>>>> isn't that true exactly when x > 0?  With flag_unsafe_optimizations
> >>>>>>> < vs. <= and thus x not having a fractional component would be
> >>>>>>> unimportant.  That said, wouldn't we expect the compares to be
> >>>>>>> simplified?
> >>>>>> I agree your said, this expression can be simplified when x not haveing
> >>>>>> a
> >>>>>>
> >>>>>> fractional component, but it's difficult to determine whether x having
> >>>>>> a
> >>>>>>
> >>>>>> fractional component. Maybe constrain x is int type and matching
> >>>>>>
> >>>>>> (floor (float@1 @0)) can simplify the expression. But that's not my
> >>>>>> purpose,
> >>>>>>
> >>>>>> maybe the commits is not clear enough, I would rephrase it.
> >>>>>>
> >>>>>> We only need to focus on the first pattern，the other three are
> >>>>>> different
> >>>>>>
> >>>>>> forms of it. I will use "left distance" to represent the distance from
> >>>>>> x
> >>>>>>
> >>>>>> to its left integer boundary, "right distance" to represent the
> >>>>>> distance
> >>>>>>
> >>>>>> from x to its right integer boundary.
> >>>>>>
> >>>>>> The pattern semantics as follows:
> >>>>>>
> >>>>>> as for x, if left distance is less than right distance, return
> >>>>>> floor(x),
> >>>>>>
> >>>>>> if left distance is greater or equal than right distance, return
> >>>>>> ceil(x).
> >>>>>>
> >>>>>> For example, when x=1.4, the left distance is less than right distance,
> >>>>>>
> >>>>>> return floor(x). (x+0.5) has not crossed the right boundary.
> >>>>>> floor(x+0.5)
> >>>>>>
> >>>>>> is equal to floor(x). when x=1.5 or x=1.6, the left distance is not
> >>>>>> less
> >>>>>>
> >>>>>> than right distance, return ceil(x). (x+0.5) has crossed the right
> >>>>>>
> >>>>>> boundary. floor(x+0.5) is equal to ceil(x).
> >>>>> I see.  So how does it relate to rint/nearbyint?
> >>>> rint funciton is round halfway cases to the even integer. like
> >>>> rint(0.5)=0.0,
> >>>>
> >>>> but this pattern of we mentioned is result 1.0.
> >>> But you remove one rounding step on the compares and add one rounding on
> >>> the replacement ( + 0.5) which is why you need to guard this with
> >>> flag_unsafe_math_optimizations, so arguing the == x.5 case is different
> >>> is futile?  Or do you say you actually do not need to guard with
> >>> flag_unsafe_math_optimizations because the behavior is exactly the
> >>> same for all double values?  (there's also Inf/Nan and signed zero to
> >>> consider, but for those it's better to check HONOR_NANS/INFS/SIGNED_ZEROS)
> > Seems I dropped the list in my last reply, adding back.
> >
> >> As you say, I need add the flag_unsafe_math_optimizations option to
> >>
> >> protect the (+0.5) operation. with the -Ofast option，enable
> >>
> >> -fno-signed-zeros, I observed when x=-0.4, the pattern would return
> >>
> >> -0.0 on aarch64-linux-gnu, return 0.0 on x86_64-linux-gnu. floor(x+0.5)
> >>
> >> will be return 0.0 on above all target.
> > Well, with -fno-signed-zeros -0.0 and 0.0 are considered the same.
> >
> >> In addition, the pattern is the same for all double values, include
> >>
> >> INFS/NANS. Do you have any advice? Sincerely asking for your advice.
> > Not really - so what you say is that you believe the transform is
> > an identity transform when following all IEEE rules on the original
> > expression besides the extra rounding step done for the + 0.5 operation.
> > It's correct to protect the rounding change with
> > -funsafe-math-optimizations (we don't have any better flag for this).
> >
> > The rounding might cause a difference for values where there's only
> > a single bit for the fractional part of the number, so + 0.5 gets
> > you always the next or the same whole number depending on the active
> > rounding mode.  But I'm not really a IEEE expert.
> >
> > Richard.
> 
> Thank you for your explanation. Indeed, I cannot guarantee that (+0.5)
> 
> will always get the expected value, so I use -funsafe-math-optimizations
> 
> to protect. I think it's better than check the HONOR_NANS/INFS/SIGNED_ZEROS.
> 
> If you have no objections to adding this flag, I will resubmit PATCH v2
> 
> when stage 1 opens for GCC 16.


Fine with me.

Thanks,
Richard.

> 
> Zhou Zhao.
> 
> >>>>>>> I'm also curious where you got these expressions to be simplified
> >>>>>>> from?
> >>>>>> In the 538.imagick_r benchmark of Spec2017, I find this pattern from
> >>>>>>
> >>>>>> MagickRound function. By simplifying the pattern, the performance of
> >>>>>> the
> >>>>>>
> >>>>>> 538 benchmark can be improved by 3% to 4% on aarch64-linux-gnu and
> >>>>>>
> >>>>>> x86_64-linux-gnu.
> >>>>> Ah, OK, can you mention this in the commit message?
> >>>>>
> >>>>> Thanks,
> >>>>> Richard.
> >>>> I will resubmit the [PATH v2] include the above mentioned commit message,
> >>>>
> >>>> and merge the cases for swapped compares by just matching (cond (op:c
> >>>> (...)).
> >>>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Zhou Zhao.
> >>>>
> >>>>>     
> >>>>>>> Note we're in stage3 so unless this fixes a bug it isn't appropriate
> >>>>>>> at this stage and should be re-posted/pinged when stage1 opens for
> >>>>>>> GCC 16.
> >>>>>>>
> >>>>>>> Richard.
> >>>>>>>
> >>>>>>>>> +(for op (lt ge)
> >>>>>>>>> +     bt (FLOOR CEIL)
> >>>>>>>>> +     bf (CEIL FLOOR)
> >>>>>>>>> +     floor (FLOOR FLOOR)
> >>>>>>>>> +     ceil (CEIL CEIL)
> >>>>>>>>> + (simplify
> >>>>>>>>> +  (cond (op (minus:s SSA_NAME@0 (floor SSA_NAME@0))
> >>>>>>>>> +       (minus:s (ceil SSA_NAME@0) SSA_NAME@0))
> >>>>>>>>> +   (bt SSA_NAME@0) (bf SSA_NAME@0))
> >>>>>>>>> +  (floor (plus @0 { build_real (type, dconsthalf); }))))
> >>>>>>>>> +(for op (gt le)
> >>>>>>>>> +     bt (FLOOR CEIL)
> >>>>>>>>> +     bf (CEIL FLOOR)
> >>>>>>>>> +     floor (FLOOR FLOOR)
> >>>>>>>>> +     ceil (CEIL CEIL)
> >>>>>>>>> + (simplify
> >>>>>>>>> +  (cond (op (minus:s (ceil SSA_NAME@0) SSA_NAME@0)
> >>>>>>>>> +       (minus:s SSA_NAME@0 (floor SSA_NAME@0)))
> >>>>>>>>> +   (bt SSA_NAME@0) (bf SSA_NAME@0))
> >>>>>>>>> +  (floor (plus @0 { build_real (type, dconsthalf); })))))
> >>>>>>>>> +
> >>>>>>>>>      (if (flag_unsafe_math_optimizations)
> >>>>>>>>>       /* Simplify (C / x op 0.0) to x op 0.0 for C != 0, C !=
> >>>>>>>>>       Inf/Nan.
> >>>>>>>>>          Since C / x may underflow to zero, do this only for unsafe
> >>>>>>>>> math.
> >>>>>>>>> */
> >>>>>>>>> diff --git a/gcc/testsuite/gcc.dg/fold-round-1.c
> >>>>>>>>> b/gcc/testsuite/gcc.dg/fold-round-1.c
> >>>>>>>>> new file mode 100644
> >>>>>>>>> index 00000000000..845d6d2e475
> >>>>>>>>> --- /dev/null
> >>>>>>>>> +++ b/gcc/testsuite/gcc.dg/fold-round-1.c
> >>>>>>>>> @@ -0,0 +1,56 @@
> >>>>>>>>> +/* { dg-do compile } */
> >>>>>>>>> +/* { dg-options "-Ofast -funsafe-math-optimizations" } */
> >>>>>>>>> +
> >>>>>>>>> +extern void link_error (void);
> >>>>>>>>> +
> >>>>>>>>> +#define TEST_ROUND(TYPE, FFLOOR, FCEIL)
> >>>>>>>>> \
> >>>>>>>>> +  void round_##FFLOOR##_1 (TYPE x)
> >>>>>>>>> \
> >>>>>>>>> +  {
> >>>>>>>>> \
> >>>>>>>>> +    TYPE t1 = 0;
> >>>>>>>>> \
> >>>>>>>>> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);
> >>>>>>>>> \
> >>>>>>>>> +    if ((x - __builtin_##FFLOOR (x)) < (__builtin_##FCEIL (x) - x))
> >>>>>>>>> \
> >>>>>>>>> +      t1 = __builtin_##FFLOOR (x);
> >>>>>>>>> \
> >>>>>>>>> +    else
> >>>>>>>>> \
> >>>>>>>>> +      t1 = __builtin_##FCEIL (x);
> >>>>>>>>> \
> >>>>>>>>> +    if (t1 != t2)
> >>>>>>>>> \
> >>>>>>>>> +      link_error ();
> >>>>>>>>> \
> >>>>>>>>> +  }
> >>>>>>>>> \
> >>>>>>>>> +  void round_##FFLOOR##_2 (TYPE x)
> >>>>>>>>> \
> >>>>>>>>> +  {
> >>>>>>>>> \
> >>>>>>>>> +    TYPE t1 = 0;
> >>>>>>>>> \
> >>>>>>>>> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);
> >>>>>>>>> \
> >>>>>>>>> +    if ((__builtin_##FCEIL (x) - x) > (x - __builtin_##FFLOOR (x)))
> >>>>>>>>> \
> >>>>>>>>> +      t1 = __builtin_##FFLOOR (x);
> >>>>>>>>> \
> >>>>>>>>> +    else
> >>>>>>>>> \
> >>>>>>>>> +      t1 = __builtin_##FCEIL (x);
> >>>>>>>>> \
> >>>>>>>>> +    if (t1 != t2)
> >>>>>>>>> \
> >>>>>>>>> +      link_error ();
> >>>>>>>>> \
> >>>>>>>>> +  }
> >>>>>>>>> \
> >>>>>>>>> +  void round_##FFLOOR##_3 (TYPE x)
> >>>>>>>>> \
> >>>>>>>>> +  {
> >>>>>>>>> \
> >>>>>>>>> +    TYPE t1 = 0;
> >>>>>>>>> \
> >>>>>>>>> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);
> >>>>>>>>> \
> >>>>>>>>> +    if ((__builtin_##FCEIL (x) - x) <= (x - __builtin_##FFLOOR
> >>>>>>>>> (x)))
> >>>>>>>>> \
> >>>>>>>>> +      t1 = __builtin_##FCEIL (x);
> >>>>>>>>> \
> >>>>>>>>> +    else
> >>>>>>>>> \
> >>>>>>>>> +      t1 = __builtin_##FFLOOR (x);
> >>>>>>>>> \
> >>>>>>>>> +    if (t1 != t2)
> >>>>>>>>> \
> >>>>>>>>> +      link_error ();
> >>>>>>>>> \
> >>>>>>>>> +  }
> >>>>>>>>> \
> >>>>>>>>> +  void round_##FFLOOR##_4 (TYPE x)
> >>>>>>>>> \
> >>>>>>>>> +  {
> >>>>>>>>> \
> >>>>>>>>> +    TYPE t1 = 0;
> >>>>>>>>> \
> >>>>>>>>> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);
> >>>>>>>>> \
> >>>>>>>>> +    if ((x - __builtin_##FFLOOR (x)) >= (__builtin_##FCEIL (x) -
> >>>>>>>>> x))
> >>>>>>>>> \
> >>>>>>>>> +      t1 = __builtin_##FCEIL (x);
> >>>>>>>>> \
> >>>>>>>>> +    else
> >>>>>>>>> \
> >>>>>>>>> +      t1 = __builtin_##FFLOOR (x);
> >>>>>>>>> \
> >>>>>>>>> +    if (t1 != t2)
> >>>>>>>>> \
> >>>>>>>>> +      link_error ();
> >>>>>>>>> \
> >>>>>>>>> +  }
> >>>>>>>>> +
> >>>>>>>>> +TEST_ROUND (float, floorf, ceilf)
> >>>>>>>>> +TEST_ROUND (double, floor, ceil)
> >>>>>>>>> +TEST_ROUND (long double, floorl, ceill)
> >>>>>>>>> +
> >>>>>>>>> +/* { dg-final { scan-assembler-not "link_error" } } */
> >>>>>>>>> -- 
> >>>>>>>>> 2.20.1
> >>>>>>>>>
> >>>>
> >>
> 
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] match.pd: Fold pattern of round semantics.

Reply via email to