On Tue, 26 Aug 2025, Zhou Zhao wrote: > > 在 2025/8/26 下午6:52, Richard Biener 写道: > > On Tue, 26 Aug 2025, Zhou Zhao wrote: > > > >> 在 2025/8/26 下午3:37, Richard Biener 写道: > >>> On Thu, 21 Aug 2025, Zhou Zhao wrote: > >>> > >>>> This patch is a respond of the patch posted at > >>>> https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673051.html > >>>> as some suggestion by Richard Biener, I have adopted these suggestions > >>>> and regenerated the patch. > >>>> > >>>> In the 538.imagick_r benchmark of Spec2017, I find these pattern from > >>>> MagickRound function. This patch implements these pattern in match.pd > >>>> for 4 rules under -funsafe-math-optimizations: > >>>> 1) (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) -> floor(x+0.5) > >>>> 2) (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) -> floor(x+0.5) > >>>> 3) (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) -> floor(x+0.5) > >>>> 4) (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) -> floor(x+0.5) > >>>> > >>>> The patch implements floor(x+0.5) operation to replace these pattern > >>>> that semantics of round(x) function. > >>> Can you say why you particularly chose floor (x + 0.5) as result > >>> while describing it to have the semantics of round (x)? The reasonable > >>> other choice is round(x) itself? > >>> > >>> What exact differences prompt you do gate this with > >>> -funsafe-math-optimizations? I can see signed zeros have > >>> different behavior, for x == -0.0 all forms in original form > >>> return -0.0 while the simplification will return 0.0. The behavior > >>> for Infs and NaNs looks unchanged. 0.5 and -0.5 seem to compute > >>> to the same value when using floor(x+0.5) as simplification (unless > >>> I made a mistake). floor or ceil do not raise IEEE exceptions, > >>> so wouldn't -fno-signed-zeros be enough as a gate? > >>> > >>> Thanks, > >>> Richard. > >>> > >> Thank you for your reply. The time interval since the last patch > >> submission might be too long, so I will re-describe our discussion on > >> the above issues: > >> > >> 1. I consider the round functions are round(x) and rint(x). In round > >> halfway cases, round(x) away from zero, rint(x) rounds x to the nearest > >> even integer. When the pattern input is x=-2.5, return -2.0, but > >> round(-2.5) return -3.0. When the pattern input is x=2.5, it return > >> 3.0, but rint(2.5) return 2.0. Therefore, using floor(x + 0.5) is the > >> best matches expression I think. Do you have any other functions with > >> semantics of round that could be used to represent this pattern? > > No, I think that covers it. rint() also is affected by the rounding > > mode so I think cannot be used here. > > > >> 2. As you mentioned, I need to add the -funsafe-math-optimizations > >> option to protect the (+0.5) operation. With the -Ofast option, which > >> enables -fno-signed-zeros, I observed that when x = -0.4, the pattern > >> returns -0.0 on aarch64-linux-gnu but returns 0.0 on x86_64-linux-gnu. > >> floor(x + 0.5) will return 0.0 on all the above targets. Additionally, > >> the pattern behaves the same for all double values, including INFs > >> and NaNs. > >> > >> 3. Indeed, I cannot guarantee that (+0.5) will always yield the > >> expected value, so I use -funsafe-math-optimizations for protection. > >> I think this is better than checking HONOR_NANS/INFS/SIGNED_ZEROS, > >> because when the true result of x + 0.5 cannot be exactly represented > >> in the target floating-point format, and happens to be halfway between > >> two adjacent floating-point numbers, different rounding rules will > >> make different choices, leading to different final results. > > I think 0.5 can be always exactly represented, but there might be > > a special 'x' where + 0.5 triggers a one-ulp difference depending > > on rounding mode. > > > > That said, HONOR_SIGN_DEPENDENT_ROUNDING might be also an issue > > because of that. > > > > But other than that the transform should be value-preserving? > yes, I think so. > > So I'd gate with !HONOR_SIGNED_ZEROS && !HONOR_SIGN_DEPENDENT_ROUNDING > > instead? > These conditions can indeed cover all the case I have considered, > I'm no issues with it. > > If no other issues, I will make the revisions and then resubmit > [patch v3] afterward.
Yes please, the rest looks OK. Richard. > Thanks > Zhou Zhao. > > Richard. > > > >> Thanks, > >> Zhou Zhao. > >>>> The patch was regtested on aarch64-linux-gnu and x86_64-linux-gnu, > >>>> SPEC 2017 and SPEC 2006 were run: > >>>> As for SPEC 2017, 538.imagick_r benchmark performance increased by 3%+ > >>>> in base test of ratio mode. > >>>> As for SPEC 2006, while the transform does not seem to be triggered, > >>>> we also see no non-noise impact on performance. > >>>> OK for mainline? > >>>> > >>>> gcc/ChangeLog: > >>>> > >>>> * match.pd: Add new pattern for round. > >>>> > >>>> gcc/testsuite/ChangeLog: > >>>> > >>>> * gcc.dg/fold-round-1.c: New test. > >>>> --- > >>>> gcc/match.pd | 17 +++++++++ > >>>> gcc/testsuite/gcc.dg/fold-round-1.c | 56 +++++++++++++++++++++++++++++ > >>>> 2 files changed, 73 insertions(+) > >>>> create mode 100644 gcc/testsuite/gcc.dg/fold-round-1.c > >>>> > >>>> diff --git a/gcc/match.pd b/gcc/match.pd > >>>> index 66e8a787449..94036603e08 100644 > >>>> --- a/gcc/match.pd > >>>> +++ b/gcc/match.pd > >>>> @@ -794,6 +794,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > >>>> (rdiv @0 (negate @1)) > >>>> (rdiv (negate @0) @1)) > >>>> +(if (flag_unsafe_math_optimizations) > >>>> +/* convert semantics of round(x) function to floor(x+0.5). */ > >>>> +/* (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) --> floor(x+0.5). */ > >>>> +/* (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) --> floor(x+0.5). > >>>> */ > >>>> +/* (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) --> floor(x+0.5). */ > >>>> +/* (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) --> floor(x+0.5). > >>>> */ > >>>> +(for op (lt ge) > >>>> + bt (FLOOR CEIL) > >>>> + bf (CEIL FLOOR) > >>>> + floor (FLOOR FLOOR) > >>>> + ceil (CEIL CEIL) > >>>> + (simplify > >>>> + (cond (op:c (minus:s SSA_NAME@0 (floor SSA_NAME@0)) > >>>> + (minus:s (ceil SSA_NAME@0) SSA_NAME@0)) > >>>> + (bt SSA_NAME@0) (bf SSA_NAME@0)) > >>>> + (floor (plus @0 { build_real (type, dconsthalf); }))))) > >>>> + > >>>> (if (flag_unsafe_math_optimizations) > >>>> /* Simplify (C / x op 0.0) to x op 0.0 for C != 0, C != Inf/Nan. > >>>> Since C / x may underflow to zero, do this only for unsafe math. > >>>> */ > >>>> diff --git a/gcc/testsuite/gcc.dg/fold-round-1.c > >>>> b/gcc/testsuite/gcc.dg/fold-round-1.c > >>>> new file mode 100644 > >>>> index 00000000000..845d6d2e475 > >>>> --- /dev/null > >>>> +++ b/gcc/testsuite/gcc.dg/fold-round-1.c > >>>> @@ -0,0 +1,56 @@ > >>>> +/* { dg-do compile } */ > >>>> +/* { dg-options "-Ofast -funsafe-math-optimizations" } */ > >>>> + > >>>> +extern void link_error (void); > >>>> + > >>>> +#define TEST_ROUND(TYPE, FFLOOR, FCEIL) > >>>> \ > >>>> + void round_##FFLOOR##_1 (TYPE x) > >>>> \ > >>>> + { > >>>> \ > >>>> + TYPE t1 = 0; > >>>> \ > >>>> + TYPE t2 = __builtin_##FFLOOR (x + 0.5); > >>>> \ > >>>> + if ((x - __builtin_##FFLOOR (x)) < (__builtin_##FCEIL (x) - x)) > >>>> \ > >>>> + t1 = __builtin_##FFLOOR (x); > >>>> \ > >>>> + else > >>>> \ > >>>> + t1 = __builtin_##FCEIL (x); > >>>> \ > >>>> + if (t1 != t2) > >>>> \ > >>>> + link_error (); > >>>> \ > >>>> + } > >>>> \ > >>>> + void round_##FFLOOR##_2 (TYPE x) > >>>> \ > >>>> + { > >>>> \ > >>>> + TYPE t1 = 0; > >>>> \ > >>>> + TYPE t2 = __builtin_##FFLOOR (x + 0.5); > >>>> \ > >>>> + if ((__builtin_##FCEIL (x) - x) > (x - __builtin_##FFLOOR (x))) > >>>> \ > >>>> + t1 = __builtin_##FFLOOR (x); > >>>> \ > >>>> + else > >>>> \ > >>>> + t1 = __builtin_##FCEIL (x); > >>>> \ > >>>> + if (t1 != t2) > >>>> \ > >>>> + link_error (); > >>>> \ > >>>> + } > >>>> \ > >>>> + void round_##FFLOOR##_3 (TYPE x) > >>>> \ > >>>> + { > >>>> \ > >>>> + TYPE t1 = 0; > >>>> \ > >>>> + TYPE t2 = __builtin_##FFLOOR (x + 0.5); > >>>> \ > >>>> + if ((__builtin_##FCEIL (x) - x) <= (x - __builtin_##FFLOOR (x))) > >>>> \ > >>>> + t1 = __builtin_##FCEIL (x); > >>>> \ > >>>> + else > >>>> \ > >>>> + t1 = __builtin_##FFLOOR (x); > >>>> \ > >>>> + if (t1 != t2) > >>>> \ > >>>> + link_error (); > >>>> \ > >>>> + } > >>>> \ > >>>> + void round_##FFLOOR##_4 (TYPE x) > >>>> \ > >>>> + { > >>>> \ > >>>> + TYPE t1 = 0; > >>>> \ > >>>> + TYPE t2 = __builtin_##FFLOOR (x + 0.5); > >>>> \ > >>>> + if ((x - __builtin_##FFLOOR (x)) >= (__builtin_##FCEIL (x) - x)) > >>>> \ > >>>> + t1 = __builtin_##FCEIL (x); > >>>> \ > >>>> + else > >>>> \ > >>>> + t1 = __builtin_##FFLOOR (x); > >>>> \ > >>>> + if (t1 != t2) > >>>> \ > >>>> + link_error (); > >>>> \ > >>>> + } > >>>> + > >>>> +TEST_ROUND (float, floorf, ceilf) > >>>> +TEST_ROUND (double, floor, ceil) > >>>> +TEST_ROUND (long double, floorl, ceill) > >>>> + > >>>> +/* { dg-final { scan-assembler-not "link_error" } } */ > >>>> > >> > >> > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)