On Wed, 8 Jan 2025, Zhou Zhao wrote: > > 在 2025/1/7 下午10:45, Richard Biener 写道: > > On Thu, 2 Jan 2025, 赵洲 wrote: > > > >> Add Reviewer Richard Biener. > >> > >> > >>> -----原始邮件----- > >>> 发件人: "Zhou Zhao" <zhaoz...@loongson.cn> > >>> 发送时间:2025-01-02 19:37:07 (星期四) > >>> 收件人: gcc-patches@gcc.gnu.org > >>> 抄送: xry...@xry111.site, i...@xen0n.name, chengl...@loongson.cn, > >>> xucheng...@loongson.cn, zhaoz...@loongson.cn > >>> 主题: [PATCH] match.pd: Fold pattern of round semantics. > >>> > >>> This patch implements 4 rules for semantics of round func in match.pd > >>> under > >>> -funsafe-math-optimizations: > >>> 1) (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) -> floor(x+0.5) > >>> 2) (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) -> floor(x+0.5) > >>> 3) (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) -> floor(x+0.5) > >>> 4) (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) -> floor(x+0.5) > >>> > >>> The patch implements floor(x+0.5) operation to replace semantics of > >>> round(x) function. > >>> The patch was regtested on aarch64-linux-gnu and x86_64-linux-gnu, SPEC > >>> 2017 and SPEC 2006 were run: > >>> As for SPEC 2017, 538.imagick_r benchmark performance increased by 3%+ > >>> in base test of ratio mode. > >>> As for SPEC 2006, while the transform does not seem to be triggered, we > >>> also see no non-noise impact on performance. > >>> OK for mainline? > >>> > >>> gcc/ChangeLog: > >>> > >>> * match.pd: Add new pattern for round. > >>> > >>> gcc/testsuite/ChangeLog: > >>> > >>> * gcc.dg/fold-round-1.c: New test. > >>> --- > >>> gcc/match.pd | 27 ++++++++++++++ > >>> gcc/testsuite/gcc.dg/fold-round-1.c | 56 +++++++++++++++++++++++++++++ > >>> 2 files changed, 83 insertions(+) > >>> create mode 100644 gcc/testsuite/gcc.dg/fold-round-1.c > >>> > >>> diff --git a/gcc/match.pd b/gcc/match.pd > >>> index 83eca8b2e0a..7b22b7913ac 100644 > >>> --- a/gcc/match.pd > >>> +++ b/gcc/match.pd > >>> @@ -777,6 +777,33 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > >>> (rdiv @0 (negate @1)) > >>> (rdiv (negate @0) @1)) > >>> +(if (flag_unsafe_math_optimizations) > >>> +/* convert semantics of round(x) function to floor(x+0.5) */ > >>> +/* (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) --> floor(x+0.5) */ > >>> +/* (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) --> floor(x+0.5) */ > >>> +/* (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) --> floor(x+0.5) */ > >>> +/* (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) --> floor(x+0.5) */ > > Hi, Richard > > Thanks for your reply. This patch is not fixes a bug. If we can discuss > > and reach a conclusion now, I will be re-posted/pinged when stage1 opens > > for GCC 16. > > Zhou Zhao. > > I think you should be able to merge the cases for swapped compares > > by just matching (cond (op:c (...)). > > If use swapped compares to merge the cases, the expression > > ((x - floor(x)) < (ceil(x) - x) ? floor(x) : ceil(x)) can be transformed to > > ((ceil(x) - x) < (x - floor(x)) ? floor(x) : ceil(x)), this is not the > expression > > I want, not be simplified to floor(x + 0.5). For example, when x = 1.4 > > the result of new expression is ceil(1.4)=2.0, but floor(1.4+0.5)=1.0. > > So I did not use swapped compares to merge the cases.
The swapped compare will match (ceil(x) - x) > (x - floor(x) ? floor(x) : ceil(x) > > I do wonder about the compares itself though, (x-floor(x)) < (ceil(x)-x) > > isn't that true exactly when x > 0? With flag_unsafe_optimizations > > < vs. <= and thus x not having a fractional component would be > > unimportant. That said, wouldn't we expect the compares to be > > simplified? > > I agree your said, this expression can be simplified when x not haveing a > > fractional component, but it's difficult to determine whether x having a > > fractional component. Maybe constrain x is int type and matching > > (floor (float@1 @0)) can simplify the expression. But that's not my purpose, > > maybe the commits is not clear enough, I would rephrase it. > > We only need to focus on the first pattern,the other three are different > > forms of it. I will use "left distance" to represent the distance from x > > to its left integer boundary, "right distance" to represent the distance > > from x to its right integer boundary. > > The pattern semantics as follows: > > as for x, if left distance is less than right distance, return floor(x), > > if left distance is greater or equal than right distance, return ceil(x). > > For example, when x=1.4, the left distance is less than right distance, > > return floor(x). (x+0.5) has not crossed the right boundary. floor(x+0.5) > > is equal to floor(x). when x=1.5 or x=1.6, the left distance is not less > > than right distance, return ceil(x). (x+0.5) has crossed the right > > boundary. floor(x+0.5) is equal to ceil(x). I see. So how does it relate to rint/nearbyint? > > I'm also curious where you got these expressions to be simplified from? > > In the 538.imagick_r benchmark of Spec2017, I find this pattern from > > MagickRound function. By simplifying the pattern, the performance of the > > 538 benchmark can be improved by 3% to 4% on aarch64-linux-gnu and > > x86_64-linux-gnu. Ah, OK, can you mention this in the commit message? Thanks, Richard. > > Note we're in stage3 so unless this fixes a bug it isn't appropriate > > at this stage and should be re-posted/pinged when stage1 opens for > > GCC 16. > > > > Richard. > > > >>> +(for op (lt ge) > >>> + bt (FLOOR CEIL) > >>> + bf (CEIL FLOOR) > >>> + floor (FLOOR FLOOR) > >>> + ceil (CEIL CEIL) > >>> + (simplify > >>> + (cond (op (minus:s SSA_NAME@0 (floor SSA_NAME@0)) > >>> + (minus:s (ceil SSA_NAME@0) SSA_NAME@0)) > >>> + (bt SSA_NAME@0) (bf SSA_NAME@0)) > >>> + (floor (plus @0 { build_real (type, dconsthalf); })))) > >>> +(for op (gt le) > >>> + bt (FLOOR CEIL) > >>> + bf (CEIL FLOOR) > >>> + floor (FLOOR FLOOR) > >>> + ceil (CEIL CEIL) > >>> + (simplify > >>> + (cond (op (minus:s (ceil SSA_NAME@0) SSA_NAME@0) > >>> + (minus:s SSA_NAME@0 (floor SSA_NAME@0))) > >>> + (bt SSA_NAME@0) (bf SSA_NAME@0)) > >>> + (floor (plus @0 { build_real (type, dconsthalf); }))))) > >>> + > >>> (if (flag_unsafe_math_optimizations) > >>> /* Simplify (C / x op 0.0) to x op 0.0 for C != 0, C != Inf/Nan. > >>> Since C / x may underflow to zero, do this only for unsafe math. */ > >>> diff --git a/gcc/testsuite/gcc.dg/fold-round-1.c > >>> b/gcc/testsuite/gcc.dg/fold-round-1.c > >>> new file mode 100644 > >>> index 00000000000..845d6d2e475 > >>> --- /dev/null > >>> +++ b/gcc/testsuite/gcc.dg/fold-round-1.c > >>> @@ -0,0 +1,56 @@ > >>> +/* { dg-do compile } */ > >>> +/* { dg-options "-Ofast -funsafe-math-optimizations" } */ > >>> + > >>> +extern void link_error (void); > >>> + > >>> +#define TEST_ROUND(TYPE, FFLOOR, FCEIL) > >>> \ > >>> + void round_##FFLOOR##_1 (TYPE x) > >>> \ > >>> + { > >>> \ > >>> + TYPE t1 = 0; > >>> \ > >>> + TYPE t2 = __builtin_##FFLOOR (x + 0.5); > >>> \ > >>> + if ((x - __builtin_##FFLOOR (x)) < (__builtin_##FCEIL (x) - x)) > >>> \ > >>> + t1 = __builtin_##FFLOOR (x); > >>> \ > >>> + else > >>> \ > >>> + t1 = __builtin_##FCEIL (x); > >>> \ > >>> + if (t1 != t2) > >>> \ > >>> + link_error (); > >>> \ > >>> + } > >>> \ > >>> + void round_##FFLOOR##_2 (TYPE x) > >>> \ > >>> + { > >>> \ > >>> + TYPE t1 = 0; > >>> \ > >>> + TYPE t2 = __builtin_##FFLOOR (x + 0.5); > >>> \ > >>> + if ((__builtin_##FCEIL (x) - x) > (x - __builtin_##FFLOOR (x))) > >>> \ > >>> + t1 = __builtin_##FFLOOR (x); > >>> \ > >>> + else > >>> \ > >>> + t1 = __builtin_##FCEIL (x); > >>> \ > >>> + if (t1 != t2) > >>> \ > >>> + link_error (); > >>> \ > >>> + } > >>> \ > >>> + void round_##FFLOOR##_3 (TYPE x) > >>> \ > >>> + { > >>> \ > >>> + TYPE t1 = 0; > >>> \ > >>> + TYPE t2 = __builtin_##FFLOOR (x + 0.5); > >>> \ > >>> + if ((__builtin_##FCEIL (x) - x) <= (x - __builtin_##FFLOOR (x))) > >>> \ > >>> + t1 = __builtin_##FCEIL (x); > >>> \ > >>> + else > >>> \ > >>> + t1 = __builtin_##FFLOOR (x); > >>> \ > >>> + if (t1 != t2) > >>> \ > >>> + link_error (); > >>> \ > >>> + } > >>> \ > >>> + void round_##FFLOOR##_4 (TYPE x) > >>> \ > >>> + { > >>> \ > >>> + TYPE t1 = 0; > >>> \ > >>> + TYPE t2 = __builtin_##FFLOOR (x + 0.5); > >>> \ > >>> + if ((x - __builtin_##FFLOOR (x)) >= (__builtin_##FCEIL (x) - x)) > >>> \ > >>> + t1 = __builtin_##FCEIL (x); > >>> \ > >>> + else > >>> \ > >>> + t1 = __builtin_##FFLOOR (x); > >>> \ > >>> + if (t1 != t2) > >>> \ > >>> + link_error (); > >>> \ > >>> + } > >>> + > >>> +TEST_ROUND (float, floorf, ceilf) > >>> +TEST_ROUND (double, floor, ceil) > >>> +TEST_ROUND (long double, floorl, ceill) > >>> + > >>> +/* { dg-final { scan-assembler-not "link_error" } } */ > >>> -- > >>> 2.20.1 > >>> > > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)