On Thu, 9 Jan 2025, Zhou Zhao wrote: > > 在 2025/1/9 下午3:33, Richard Biener 写道: > > On Thu, 9 Jan 2025, Zhou Zhao wrote: > > > >> 在 2025/1/8 下午6:30, Richard Biener 写道: > >>> On Wed, 8 Jan 2025, Zhou Zhao wrote: > >>> > >>>> 在 2025/1/8 下午5:04, Richard Biener 写道: > >>>>> On Wed, 8 Jan 2025, Zhou Zhao wrote: > >>>>> > >>>>>> 在 2025/1/7 下午10:45, Richard Biener 写道: > >>>>>>> On Thu, 2 Jan 2025, 赵洲 wrote: > >>>>>>> > >>>>>>>> Add Reviewer Richard Biener. > >>>>>>>> > >>>>>>>> > >>>>>>>>> -----原始邮件----- > >>>>>>>>> 发件人: "Zhou Zhao" <zhaoz...@loongson.cn> > >>>>>>>>> 发送时间:2025-01-02 19:37:07 (星期四) > >>>>>>>>> 收件人: gcc-patches@gcc.gnu.org > >>>>>>>>> 抄送: xry...@xry111.site, i...@xen0n.name, chengl...@loongson.cn, > >>>>>>>>> xucheng...@loongson.cn, zhaoz...@loongson.cn > >>>>>>>>> 主题: [PATCH] match.pd: Fold pattern of round semantics. > >>>>>>>>> > >>>>>>>>> This patch implements 4 rules for semantics of round func in > >>>>>>>>> match.pd > >>>>>>>>> under > >>>>>>>>> -funsafe-math-optimizations: > >>>>>>>>> 1) (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) -> floor(x+0.5) > >>>>>>>>> 2) (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) -> floor(x+0.5) > >>>>>>>>> 3) (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) -> floor(x+0.5) > >>>>>>>>> 4) (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) -> floor(x+0.5) > >>>>>>>>> > >>>>>>>>> The patch implements floor(x+0.5) operation to replace semantics of > >>>>>>>>> round(x) function. > >>>>>>>>> The patch was regtested on aarch64-linux-gnu and x86_64-linux-gnu, > >>>>>>>>> SPEC > >>>>>>>>> 2017 and SPEC 2006 were run: > >>>>>>>>> As for SPEC 2017, 538.imagick_r benchmark performance increased by > >>>>>>>>> 3%+ > >>>>>>>>> in base test of ratio mode. > >>>>>>>>> As for SPEC 2006, while the transform does not seem to be triggered, > >>>>>>>>> we > >>>>>>>>> also see no non-noise impact on performance. > >>>>>>>>> OK for mainline? > >>>>>>>>> > >>>>>>>>> gcc/ChangeLog: > >>>>>>>>> > >>>>>>>>> * match.pd: Add new pattern for round. > >>>>>>>>> > >>>>>>>>> gcc/testsuite/ChangeLog: > >>>>>>>>> > >>>>>>>>> * gcc.dg/fold-round-1.c: New test. > >>>>>>>>> --- > >>>>>>>>> gcc/match.pd | 27 ++++++++++++++ > >>>>>>>>> gcc/testsuite/gcc.dg/fold-round-1.c | 56 > >>>>>>>>> +++++++++++++++++++++++++++++ > >>>>>>>>> 2 files changed, 83 insertions(+) > >>>>>>>>> create mode 100644 gcc/testsuite/gcc.dg/fold-round-1.c > >>>>>>>>> > >>>>>>>>> diff --git a/gcc/match.pd b/gcc/match.pd > >>>>>>>>> index 83eca8b2e0a..7b22b7913ac 100644 > >>>>>>>>> --- a/gcc/match.pd > >>>>>>>>> +++ b/gcc/match.pd > >>>>>>>>> @@ -777,6 +777,33 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > >>>>>>>>> (rdiv @0 (negate @1)) > >>>>>>>>> (rdiv (negate @0) @1)) > >>>>>>>>> +(if (flag_unsafe_math_optimizations) > >>>>>>>>> +/* convert semantics of round(x) function to floor(x+0.5) */ > >>>>>>>>> +/* (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) --> floor(x+0.5) > >>>>>>>>> */ > >>>>>>>>> +/* (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) --> > >>>>>>>>> floor(x+0.5) > >>>>>>>>> */ > >>>>>>>>> +/* (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) --> floor(x+0.5) > >>>>>>>>> */ > >>>>>>>>> +/* (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) --> > >>>>>>>>> floor(x+0.5) > >>>>>>>>> */ > >>>>>> Hi, Richard > >>>>>> > >>>>>> Thanks for your reply. This patch is not fixes a bug. If we can discuss > >>>>>> > >>>>>> and reach a conclusion now, I will be re-posted/pinged when stage1 > >>>>>> opens > >>>>>> > >>>>>> for GCC 16. > >>>>>> > >>>>>> Zhou Zhao. > >>>>>>> I think you should be able to merge the cases for swapped compares > >>>>>>> by just matching (cond (op:c (...)). > >>>>>> If use swapped compares to merge the cases, the expression > >>>>>> > >>>>>> ((x - floor(x)) < (ceil(x) - x) ? floor(x) : ceil(x)) can be > >>>>>> transformed > >>>>>> to > >>>>>> > >>>>>> ((ceil(x) - x) < (x - floor(x)) ? floor(x) : ceil(x)), this is not the > >>>>>> expression > >>>>>> > >>>>>> I want, not be simplified to floor(x + 0.5). For example, when x = 1.4 > >>>>>> > >>>>>> the result of new expression is ceil(1.4)=2.0, but floor(1.4+0.5)=1.0. > >>>>>> > >>>>>> So I did not use swapped compares to merge the cases. > >>>>> The swapped compare will match > >>>>> > >>>>> (ceil(x) - x) > (x - floor(x) ? floor(x) : ceil(x) > >>>> Ah, you are correct. I try use the swapped compares merge the cases, it's > >>>> > >>>> consistent with what you said. I misunderstood the swapped comparison > >>>> symbols. > >>>> > >>>>>>> I do wonder about the compares itself though, (x-floor(x)) < > >>>>>>> (ceil(x)-x) > >>>>>>> isn't that true exactly when x > 0? With flag_unsafe_optimizations > >>>>>>> < vs. <= and thus x not having a fractional component would be > >>>>>>> unimportant. That said, wouldn't we expect the compares to be > >>>>>>> simplified? > >>>>>> I agree your said, this expression can be simplified when x not haveing > >>>>>> a > >>>>>> > >>>>>> fractional component, but it's difficult to determine whether x having > >>>>>> a > >>>>>> > >>>>>> fractional component. Maybe constrain x is int type and matching > >>>>>> > >>>>>> (floor (float@1 @0)) can simplify the expression. But that's not my > >>>>>> purpose, > >>>>>> > >>>>>> maybe the commits is not clear enough, I would rephrase it. > >>>>>> > >>>>>> We only need to focus on the first pattern,the other three are > >>>>>> different > >>>>>> > >>>>>> forms of it. I will use "left distance" to represent the distance from > >>>>>> x > >>>>>> > >>>>>> to its left integer boundary, "right distance" to represent the > >>>>>> distance > >>>>>> > >>>>>> from x to its right integer boundary. > >>>>>> > >>>>>> The pattern semantics as follows: > >>>>>> > >>>>>> as for x, if left distance is less than right distance, return > >>>>>> floor(x), > >>>>>> > >>>>>> if left distance is greater or equal than right distance, return > >>>>>> ceil(x). > >>>>>> > >>>>>> For example, when x=1.4, the left distance is less than right distance, > >>>>>> > >>>>>> return floor(x). (x+0.5) has not crossed the right boundary. > >>>>>> floor(x+0.5) > >>>>>> > >>>>>> is equal to floor(x). when x=1.5 or x=1.6, the left distance is not > >>>>>> less > >>>>>> > >>>>>> than right distance, return ceil(x). (x+0.5) has crossed the right > >>>>>> > >>>>>> boundary. floor(x+0.5) is equal to ceil(x). > >>>>> I see. So how does it relate to rint/nearbyint? > >>>> rint funciton is round halfway cases to the even integer. like > >>>> rint(0.5)=0.0, > >>>> > >>>> but this pattern of we mentioned is result 1.0. > >>> But you remove one rounding step on the compares and add one rounding on > >>> the replacement ( + 0.5) which is why you need to guard this with > >>> flag_unsafe_math_optimizations, so arguing the == x.5 case is different > >>> is futile? Or do you say you actually do not need to guard with > >>> flag_unsafe_math_optimizations because the behavior is exactly the > >>> same for all double values? (there's also Inf/Nan and signed zero to > >>> consider, but for those it's better to check HONOR_NANS/INFS/SIGNED_ZEROS) > > Seems I dropped the list in my last reply, adding back. > > > >> As you say, I need add the flag_unsafe_math_optimizations option to > >> > >> protect the (+0.5) operation. with the -Ofast option,enable > >> > >> -fno-signed-zeros, I observed when x=-0.4, the pattern would return > >> > >> -0.0 on aarch64-linux-gnu, return 0.0 on x86_64-linux-gnu. floor(x+0.5) > >> > >> will be return 0.0 on above all target. > > Well, with -fno-signed-zeros -0.0 and 0.0 are considered the same. > > > >> In addition, the pattern is the same for all double values, include > >> > >> INFS/NANS. Do you have any advice? Sincerely asking for your advice. > > Not really - so what you say is that you believe the transform is > > an identity transform when following all IEEE rules on the original > > expression besides the extra rounding step done for the + 0.5 operation. > > It's correct to protect the rounding change with > > -funsafe-math-optimizations (we don't have any better flag for this). > > > > The rounding might cause a difference for values where there's only > > a single bit for the fractional part of the number, so + 0.5 gets > > you always the next or the same whole number depending on the active > > rounding mode. But I'm not really a IEEE expert. > > > > Richard. > > Thank you for your explanation. Indeed, I cannot guarantee that (+0.5) > > will always get the expected value, so I use -funsafe-math-optimizations > > to protect. I think it's better than check the HONOR_NANS/INFS/SIGNED_ZEROS. > > If you have no objections to adding this flag, I will resubmit PATCH v2 > > when stage 1 opens for GCC 16.
Fine with me. Thanks, Richard. > > Zhou Zhao. > > >>>>>>> I'm also curious where you got these expressions to be simplified > >>>>>>> from? > >>>>>> In the 538.imagick_r benchmark of Spec2017, I find this pattern from > >>>>>> > >>>>>> MagickRound function. By simplifying the pattern, the performance of > >>>>>> the > >>>>>> > >>>>>> 538 benchmark can be improved by 3% to 4% on aarch64-linux-gnu and > >>>>>> > >>>>>> x86_64-linux-gnu. > >>>>> Ah, OK, can you mention this in the commit message? > >>>>> > >>>>> Thanks, > >>>>> Richard. > >>>> I will resubmit the [PATH v2] include the above mentioned commit message, > >>>> > >>>> and merge the cases for swapped compares by just matching (cond (op:c > >>>> (...)). > >>>> > >>>> > >>>> Thanks, > >>>> > >>>> Zhou Zhao. > >>>> > >>>>> > >>>>>>> Note we're in stage3 so unless this fixes a bug it isn't appropriate > >>>>>>> at this stage and should be re-posted/pinged when stage1 opens for > >>>>>>> GCC 16. > >>>>>>> > >>>>>>> Richard. > >>>>>>> > >>>>>>>>> +(for op (lt ge) > >>>>>>>>> + bt (FLOOR CEIL) > >>>>>>>>> + bf (CEIL FLOOR) > >>>>>>>>> + floor (FLOOR FLOOR) > >>>>>>>>> + ceil (CEIL CEIL) > >>>>>>>>> + (simplify > >>>>>>>>> + (cond (op (minus:s SSA_NAME@0 (floor SSA_NAME@0)) > >>>>>>>>> + (minus:s (ceil SSA_NAME@0) SSA_NAME@0)) > >>>>>>>>> + (bt SSA_NAME@0) (bf SSA_NAME@0)) > >>>>>>>>> + (floor (plus @0 { build_real (type, dconsthalf); })))) > >>>>>>>>> +(for op (gt le) > >>>>>>>>> + bt (FLOOR CEIL) > >>>>>>>>> + bf (CEIL FLOOR) > >>>>>>>>> + floor (FLOOR FLOOR) > >>>>>>>>> + ceil (CEIL CEIL) > >>>>>>>>> + (simplify > >>>>>>>>> + (cond (op (minus:s (ceil SSA_NAME@0) SSA_NAME@0) > >>>>>>>>> + (minus:s SSA_NAME@0 (floor SSA_NAME@0))) > >>>>>>>>> + (bt SSA_NAME@0) (bf SSA_NAME@0)) > >>>>>>>>> + (floor (plus @0 { build_real (type, dconsthalf); }))))) > >>>>>>>>> + > >>>>>>>>> (if (flag_unsafe_math_optimizations) > >>>>>>>>> /* Simplify (C / x op 0.0) to x op 0.0 for C != 0, C != > >>>>>>>>> Inf/Nan. > >>>>>>>>> Since C / x may underflow to zero, do this only for unsafe > >>>>>>>>> math. > >>>>>>>>> */ > >>>>>>>>> diff --git a/gcc/testsuite/gcc.dg/fold-round-1.c > >>>>>>>>> b/gcc/testsuite/gcc.dg/fold-round-1.c > >>>>>>>>> new file mode 100644 > >>>>>>>>> index 00000000000..845d6d2e475 > >>>>>>>>> --- /dev/null > >>>>>>>>> +++ b/gcc/testsuite/gcc.dg/fold-round-1.c > >>>>>>>>> @@ -0,0 +1,56 @@ > >>>>>>>>> +/* { dg-do compile } */ > >>>>>>>>> +/* { dg-options "-Ofast -funsafe-math-optimizations" } */ > >>>>>>>>> + > >>>>>>>>> +extern void link_error (void); > >>>>>>>>> + > >>>>>>>>> +#define TEST_ROUND(TYPE, FFLOOR, FCEIL) > >>>>>>>>> \ > >>>>>>>>> + void round_##FFLOOR##_1 (TYPE x) > >>>>>>>>> \ > >>>>>>>>> + { > >>>>>>>>> \ > >>>>>>>>> + TYPE t1 = 0; > >>>>>>>>> \ > >>>>>>>>> + TYPE t2 = __builtin_##FFLOOR (x + 0.5); > >>>>>>>>> \ > >>>>>>>>> + if ((x - __builtin_##FFLOOR (x)) < (__builtin_##FCEIL (x) - x)) > >>>>>>>>> \ > >>>>>>>>> + t1 = __builtin_##FFLOOR (x); > >>>>>>>>> \ > >>>>>>>>> + else > >>>>>>>>> \ > >>>>>>>>> + t1 = __builtin_##FCEIL (x); > >>>>>>>>> \ > >>>>>>>>> + if (t1 != t2) > >>>>>>>>> \ > >>>>>>>>> + link_error (); > >>>>>>>>> \ > >>>>>>>>> + } > >>>>>>>>> \ > >>>>>>>>> + void round_##FFLOOR##_2 (TYPE x) > >>>>>>>>> \ > >>>>>>>>> + { > >>>>>>>>> \ > >>>>>>>>> + TYPE t1 = 0; > >>>>>>>>> \ > >>>>>>>>> + TYPE t2 = __builtin_##FFLOOR (x + 0.5); > >>>>>>>>> \ > >>>>>>>>> + if ((__builtin_##FCEIL (x) - x) > (x - __builtin_##FFLOOR (x))) > >>>>>>>>> \ > >>>>>>>>> + t1 = __builtin_##FFLOOR (x); > >>>>>>>>> \ > >>>>>>>>> + else > >>>>>>>>> \ > >>>>>>>>> + t1 = __builtin_##FCEIL (x); > >>>>>>>>> \ > >>>>>>>>> + if (t1 != t2) > >>>>>>>>> \ > >>>>>>>>> + link_error (); > >>>>>>>>> \ > >>>>>>>>> + } > >>>>>>>>> \ > >>>>>>>>> + void round_##FFLOOR##_3 (TYPE x) > >>>>>>>>> \ > >>>>>>>>> + { > >>>>>>>>> \ > >>>>>>>>> + TYPE t1 = 0; > >>>>>>>>> \ > >>>>>>>>> + TYPE t2 = __builtin_##FFLOOR (x + 0.5); > >>>>>>>>> \ > >>>>>>>>> + if ((__builtin_##FCEIL (x) - x) <= (x - __builtin_##FFLOOR > >>>>>>>>> (x))) > >>>>>>>>> \ > >>>>>>>>> + t1 = __builtin_##FCEIL (x); > >>>>>>>>> \ > >>>>>>>>> + else > >>>>>>>>> \ > >>>>>>>>> + t1 = __builtin_##FFLOOR (x); > >>>>>>>>> \ > >>>>>>>>> + if (t1 != t2) > >>>>>>>>> \ > >>>>>>>>> + link_error (); > >>>>>>>>> \ > >>>>>>>>> + } > >>>>>>>>> \ > >>>>>>>>> + void round_##FFLOOR##_4 (TYPE x) > >>>>>>>>> \ > >>>>>>>>> + { > >>>>>>>>> \ > >>>>>>>>> + TYPE t1 = 0; > >>>>>>>>> \ > >>>>>>>>> + TYPE t2 = __builtin_##FFLOOR (x + 0.5); > >>>>>>>>> \ > >>>>>>>>> + if ((x - __builtin_##FFLOOR (x)) >= (__builtin_##FCEIL (x) - > >>>>>>>>> x)) > >>>>>>>>> \ > >>>>>>>>> + t1 = __builtin_##FCEIL (x); > >>>>>>>>> \ > >>>>>>>>> + else > >>>>>>>>> \ > >>>>>>>>> + t1 = __builtin_##FFLOOR (x); > >>>>>>>>> \ > >>>>>>>>> + if (t1 != t2) > >>>>>>>>> \ > >>>>>>>>> + link_error (); > >>>>>>>>> \ > >>>>>>>>> + } > >>>>>>>>> + > >>>>>>>>> +TEST_ROUND (float, floorf, ceilf) > >>>>>>>>> +TEST_ROUND (double, floor, ceil) > >>>>>>>>> +TEST_ROUND (long double, floorl, ceill) > >>>>>>>>> + > >>>>>>>>> +/* { dg-final { scan-assembler-not "link_error" } } */ > >>>>>>>>> -- > >>>>>>>>> 2.20.1 > >>>>>>>>> > >>>> > >> > > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)