On Tue, Aug 21, 2018 at 11:27 PM Jeff Law <l...@redhat.com> wrote:
>
> On 08/21/2018 02:08 PM, Giuliano Augusto Faulin Belinassi wrote:
> >> Just as an example, compare the results for
> >> x = 0x1.fffffffffffffp1023
> >
> > Thank you for your answer and the counterexample. :-)
> >
> >> If we had useful range info on floats we might conditionalize such
> >> transforms appropriately. Or we can enable it on floats and do
> >> the sqrt (x*x + 1) in double.
> >
> > I think I managed to find a bound where the transformation can be done
> > without overflow harm; I don't know about rounding problems, however.
> >
> > Suppose we are handling double-precision floats for now. The function
> > x/sqrt(1 + x*x) approaches 1 when x is big enough. How big must x be
> > for the function to round to 1?
> >
> > Since sqrt(1 + x*x) > x when x > 1, we must find the value of x at
> > which x/sqrt(1 + x*x) reaches eps, where eps is the biggest double
> > smaller than 1. Such eps must be around 1 - 2^-53 in IEEE double
> > because the mantissa has 52 bits. Solving for x yields that x must be
> > somewhat bigger than 6.7e7, so let's take 1e8. Therefore if abs(x) >
> > 1e8, it is enough to return copysign(1, x). Notice that this argument
> > is also valid for x = +-inf (if the target supports that) because
> > sin(atan(+-inf)) = +-1, and it can be extended to other floating-point
> > formats. The following test code illustrates my point:
> > https://pastebin.com/M4G4neLQ
> >
> > This might still be faster than calculating sin(atan(x)) explicitly.
> >
> > Please let me know if this is unfeasible. :-)
> The problem is our VRP implementation doesn't handle any floating point
> types at this time. If we had range information for FP types, then
> this kind of analysis is precisely what we'd need to do the
> transformation regardless of -ffast-math.
I think his idea was to emit a runtime test? You'd have to use a
COND_EXPR and evaluate both arms at the same time because match.pd
doesn't allow you to create control flow.

Note the rounding issue is also real, given that for large x you strip
away the lower mantissa bits when computing x*x.

Richard.

> jeff
> >
> > Giuliano.
> >
> > On Tue, Aug 21, 2018 at 11:27 AM, Jeff Law <l...@redhat.com> wrote:
> >> On 08/21/2018 02:02 AM, Richard Biener wrote:
> >>> On Mon, Aug 20, 2018 at 9:40 PM Jeff Law <l...@redhat.com> wrote:
> >>>>
> >>>> On 08/04/2018 07:22 AM, Giuliano Augusto Faulin Belinassi wrote:
> >>>>> Closes bug #86829
> >>>>>
> >>>>> Description: Adds substitution rules for both sin(atan(x)) and
> >>>>> cos(atan(x)). These formulas are replaced by x / sqrt(x*x + 1) and
> >>>>> 1 / sqrt(x*x + 1) respectively, providing up to 10x speedup. This
> >>>>> identity can be proved mathematically.
> >>>>>
> >>>>> Changelog:
> >>>>>
> >>>>> 2018-08-03  Giuliano Belinassi  <giuliano.belina...@usp.br>
> >>>>>
> >>>>>         * match.pd: add simplification rules to sin(atan(x)) and
> >>>>>         cos(atan(x)).
> >>>>>
> >>>>> Bootstrap and Testing:
> >>>>> There were no unexpected failures in proper testing with GCC 8.1.0
> >>>>> on an x86_64 machine running Ubuntu 18.04.
> >>>> I understand these are mathematical identities. But floating point
> >>>> arithmetic in a compiler isn't nearly that clean :-) We have to worry
> >>>> about overflows, underflows, rounding, and the simple fact that many
> >>>> floating point numbers can't be exactly represented.
> >>>>
> >>>> Just as an example, compare the results for
> >>>> x = 0x1.fffffffffffffp1023
> >>>>
> >>>> I think sin(atan (x)) is well defined in that case. But the x*x isn't
> >>>> because it overflows.
> >>>>
> >>>> So I think this has to be somewhere under the -ffast-math umbrella.
> >>>> And the testing requirements for that are painful -- you have to verify
> >>>> it doesn't break the spec benchmark.
> >>>>
> >>>> I know Richi acked in the PR, but that might have been premature.
> >>>
> >>> It's under the flag_unsafe_math_optimizations umbrella, but sure,
> >>> a "proper" way to optimize this would be to further expand
> >>> sqrt (x*x + 1) to fabs(x) + ... (extra terms) that are precise enough
> >>> and do not have this overflow issue.
> >>>
> >>> But yes, I do not find (quickly skimming) other simplifications that
> >>> have this kind of overflow issue (in fact I do remember raising
> >>> overflow/underflow issues for other patches).
> >>>
> >>> Thus approval withdrawn.
> >> At least until we can do some testing around spec. There's also a patch
> >> for logarithm addition/subtraction from MCC CS and another from Giuliano
> >> for hyperbolics that need testing with spec. I think that getting that
> >> testing done anytime between now and stage1 close is sufficient -- none
> >> of the 3 patches is particularly complex.
> >>
> >>
> >>>
> >>> If we had useful range info on floats we might conditionalize such
> >>> transforms appropriately. Or we can enable it on floats and do
> >>> the sqrt (x*x + 1) in double.
> >> Yea. I keep thinking about what it might take to start doing some light
> >> VRP of floating point objects. I'd originally been thinking to just
> >> track 0.0 and exceptional value state. But the more I ponder, the more
> >> I think we could use the range information to allow transformations
> >> that are currently guarded by the -ffast-math family of options.
> >>
> >> jeff
> >>
>