> On 8 Oct 2024, at 10:44, Richard Biener <rguent...@suse.de> wrote: > > External email: Use caution opening links or attachments > > > On Thu, 3 Oct 2024, Jennifer Schmitz wrote: > >> >> >>> On 1 Oct 2024, at 14:27, Richard Biener <rguent...@suse.de> wrote: >>> >>> External email: Use caution opening links or attachments >>> >>> >>> On Tue, 1 Oct 2024, Jennifer Schmitz wrote: >>> >>>> This patch implements 4 rules for logarithmic identities in match.pd >>>> under -funsafe-math-optimizations: >>>> 1) logN(1.0/a) -> -logN(a). This avoids the division instruction. >>>> 2) logN(C/a) -> logN(C) - logN(a), where C is a real constant. Same as 1). >>>> 3) logN(a) + logN(b) -> logN(a*b). This reduces the number of calls to >>>> log function. >>>> 4) logN(a) - logN(b) -> logN(a/b). Same as 4). >>>> Tests were added for float, double, and long double. >>>> >>>> The patch was bootstrapped and regtested on aarch64-linux-gnu and >>>> x86_64-linux-gnu, no regression. >>>> Additionally, SPEC 2017 fprate was run. While the transform does not seem >>>> to be triggered, we also see no non-noise impact on performance. >>>> OK for mainline? >>> >>> Since log can set errno we have the builtins affect global memory and >>> thus have VDEFs, this posses issues for match.pd which does not assign >>> new VDEFs upon materializing the result, esp. for the case where >>> you duplicate a call. There's a similar issue for -frounding-math >>> where intermediate FP status changes can be lost. match.pd simply >>> follows the SSA use-def chains without regarding memory side-effects. >>> >>> The transforms are guarded by flag_unsafe_math_optimizations but here >>> I think we need !HONOR_SIGN_DEPENDENT_ROUNDING, !flag_trapping_math >>> (exception state might be different for logN(a) - logN(b) -> logN(a/b), >>> at least WRT INEXACT?), and !flag_errno_math (because of the VDEFs). >>> >>> + /* Simplify logN(C/a) into logN(C)-logN(a). */ >>> + (simplify >>> + (logs (rdiv:s REAL_CST@0 @1)) >>> + (minus (logs @0) (logs @1))) >>> >>> I think you want >>> >>> (minus (logs! @0) (logs @1)) >>> >>> here to make sure we constant-fold. >>> >>> + (simplify >>> + (minus (logs:s @0) (logs:s @1)) >>> + (logs (rdiv @0 @1)))) >>> >>> I think that's somewhat dangerous for @1 == 0 given log for >>> zero arg results in -HUGE_VAL but a FP division by gives a NaN. >>> I'm not exactly sure whether !HONOR_INFINITIES && !HONOR_NANS >>> is good enough here. >>> >>> Your testcases probably all trigger during GENERIC folding, >>> bypassing the VDEF issue - you might want to try assigning >>> the comparison operands to tempoaries to run into the actual >>> issues. >> Dear Richard, >> Thanks for the review and suggesting the additional flags. I added >> - !HONOR_SIGN_DEPENDENT_ROUNDING >> - !flag_trapping_math >> - !flag_errno_math >> - !HONOR_INFINITIES >> - !HONOR_NANS >> as guard before the patterns. >> Can we add anything else to account for HUGE_VAL or will !HONOR_INFINITIES >> && !HONOR_NANS be enough? Or do you have a suggestion how I can check this? >> I validated again on aarch64 and x86_64. > > Can you change your patch attachment format to be at least text/plain > and ideally just inline it or use git send-mail? This way reviewing > is much easier. > > I'll quote it here for reference: > > (if (flag_unsafe_math_optimizations) > > [...] > > + (if (! HONOR_SIGN_DEPENDENT_ROUNDING (type) > + && ! HONOR_NANS (type) && ! HONOR_INFINITIES (type) > + && ! flag_trapping_math > + && ! flag_errno_math) > + (for logs (LOG LOG2 LOG10) > + /* Simplify logN(1.0/a) into -logN(a). */ > + (simplify > + (logs (rdiv:s real_onep@0 @1)) > + (negate (logs @1))) > + > + /* Simplify logN(C/a) into logN(C)-logN(a). */ > + (simplify > + (logs (rdiv:s REAL_CST@0 @1)) > + (minus (logs! @0) (logs @1))) > + > + /* Simplify logN(a)+logN(b) into logN(a*b). */ > + (simplify > + (plus (logs:s @0) (logs:s @1)) > + (logs (mult @0 @1))) > + > + /* Simplify logN(a)-logN(b) into logN(a/b). */ > + (simplify > + (minus (logs:s @0) (logs:s @1)) > + (logs (rdiv @0 @1))))) > > I'm OK with the extra guards added but I'm also not a IEEE FP expert > here. In the previous review I did mention the extra constraints > for specific sub-patterns IIRC. > > As a general note we might want to implement a HONOR_ERRNO (combined_fn). > Probably a bad name, builtin_can_set_errno (combined_fn) might be better, > for example above I'm not sure whether all of log(), log2() and log10() > can set errno and how we generally should handle errno when > -fno-math-errno isn't given and GCC wasn't configured specifically > to target for example glibc. For glibc it might be nice if we could > tell it we're not interested in errno from math functions and in this > way convey -fno-math-errno to it for example via a crtfastmath like > mechanism (or by calling alternate entry points). > > So, the patch is OK in case Joseph doesn't have any comments during > this week. Thanks, then I will commit it at the end of the day, if Joseph does not have further comments. Best, Jennifer > > Thanks, > Richard.
smime.p7s
Description: S/MIME cryptographic signature