Re: [PATCH] Optimize sin(atan(x)) and cos(atan(x)), take 3 (PR tree-optimization/86829)

Giuliano Augusto Faulin Belinassi Tue, 16 Oct 2018 10:34:11 -0700

Hello,

> cosatanf:
> .LFB3:
>         .loc 1 44 1 is_stmt 1
>         .cfi_startproc
>        @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
> .LVL50:
>         .loc 1 45 5
>         .loc 1 44 1 is_stmt 0
>         push    {r4, lr}
>         .cfi_def_cfa_offset 8
>         .cfi_offset 4, -8
>         .cfi_offset 14, -4
>         .loc 1 45 12
>         bl      atanf
> .LVL51:
>         .loc 1 46 1
>          pop     {r4, lr}
>         .cfi_restore 14
>         .cfi_restore 4
>         .cfi_def_cfa_offset 0
>         .loc 1 45 12
>         b       cosf
> .LVL52:


This means that the expression 'cos (atan (x))' was not simplified
:-(. Did the check at line 4281 (svn revision 265209) in gcc/match.pd
failed?

> Yes, if we want to skip the test, but I'm not sure about that?
Since the only point of this patch is to simplify these kind of
expressions, and it is not being simplified at all in your arch as the
asm dump suggests, then it seems safe to skip all sinatan-*.c tests.
On Tue, Oct 16, 2018 at 1:25 PM Christophe Lyon
<christophe.l...@linaro.org> wrote:
>
> On Tue, 16 Oct 2018 at 18:04, Giuliano Augusto Faulin Belinassi
> <giuliano.belina...@usp.br> wrote:
> >
> > Hello, Christophe
> >     Could you please dump the assembly of cosatanf here?
> >
> >
> Sure:
>         .global cosatanf
>         .syntax unified
>         .arm
>         .fpu softvfp
>         .type   cosatanf, %function
> cosatanf:
> .LFB3:
>         .loc 1 44 1 is_stmt 1
>         .cfi_startproc
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
> .LVL50:
>         .loc 1 45 5
>         .loc 1 44 1 is_stmt 0
>         push    {r4, lr}
>         .cfi_def_cfa_offset 8
>         .cfi_offset 4, -8
>         .cfi_offset 14, -4
>         .loc 1 45 12
>         bl      atanf
> .LVL51:
>         .loc 1 46 1
>         pop     {r4, lr}
>         .cfi_restore 14
>         .cfi_restore 4
>         .cfi_def_cfa_offset 0
>         .loc 1 45 12
>         b       cosf
> .LVL52:
>         .cfi_endproc
> .LFE3:
>         .size   cosatanf, .-cosatanf
>
> So, upon entry we have r0=0x5f800000
> atanf returns 0x3fc90fdb
> and cosf returns 0xb33bbd2e
>
>
> > On Tue, Oct 16, 2018 at 12:23 PM Christophe Lyon
> > <christophe.l...@linaro.org> wrote:
> > >
> > > On Tue, 16 Oct 2018 at 17:15, Giuliano Augusto Faulin Belinassi
> > > <giuliano.belina...@usp.br> wrote:
> > > >
> > > > Hello. Sorry for the late reply.
> > > >
> > > > > but then cosatanf is computed as (ie there's not call to cosatanf()):
> > > > >        movw    r3, #48430
> > > > >        movt    r3, 45883
> > > > > so r3=0xb33bbd2e (-4.371139E-8) which is not zero.
> > > > Does this behavior is still present if we change the line 58 to:
> > > >     int __attribute__ ((optimize("O0")))
> > > > in sinatan-1.c?
> > >
> > > No, this now generates:
> > >         ldr     r0, [fp, #-32]
> > >         bl      cosatanf
> > > where r0=0x5f800000, and cosatanf() returns 0xb33bbd2e
> > > (ie the same value as what was computed by GCC)
> > >
> > > > On Fri, Oct 12, 2018 at 3:24 PM Christophe Lyon
> > > > <christophe.l...@linaro.org> wrote:
> > > > >
> > > > > On Fri, 12 Oct 2018 at 20:01, Giuliano Augusto Faulin Belinassi
> > > > > <giuliano.belina...@usp.br> wrote:
> > > > > >
> > > > > > > fc is built with:
> > > > > > >        mov     r0, #0
> > > > > > >        movt    r0, 24448
> > > > > > > so r0=0x5f800000 (1.8446744E19) which looks ok
> > > > > >
> > > > > > this is correct. My x86_64 yields the same value
> > > > > >
> > > > > > > but then cosatanf is computed as (ie there's not call to 
> > > > > > > cosatanf()):
> > > > > > >        movw    r3, #48430
> > > > > > >        movt    r3, 45883
> > > > > > > so r3=0xb33bbd2e (-4.371139E-8) which is not zero.
> > > > > >
> > > > > > Ugh. So GCC replaced the function call with a precomputed value? 
> > > > > > This
> > > > > > does not happens in my x86_64. Maybe it is Ofast's fault?
> > > > > > Also, it seems that GCC is precomputing cos(atan(x)) before the
> > > > > > substitution, as the following python script yields the same result:
> > > > > >
> > > > >
> > > > > Yes, I was surprised to see that.
> > > > >
> > > > > > import numpy as np
> > > > > > x = np.float32 (1.8446744e19)
> > > > > > print (np.cos (np.arctan (x))
> > > > > >
> > > > > > I would also like to add that -4.371139E-8 is very far away from
> > > > > > 5.421011E-20, which is a "more" correct value for this computation. 
> > > > > > So
> > > > > > returning 0 may be a better option?
> > > > > > On Fri, Oct 12, 2018 at 12:57 PM Jeff Law <l...@redhat.com> wrote:
> > > > > > >
> > > > > > > On 10/12/18 9:51 AM, Giuliano Augusto Faulin Belinassi wrote:
> > > > > > > > Hello
> > > > > > > >      What is the output of these functions on such arch? Since 
> > > > > > > > the
> > > > > > > > test didn't fail for the sinatan counterpart, an possible 
> > > > > > > > explanation
> > > > > > > > would be that the calculation of the sqrf, sqrt and sqrtl (lines
> > > > > > > > 62-64) yielded a number that is far behind of what it should be.
> > > > > > > > However, I am still not sure about this, so I will investigate
> > > > > > > > further.
> > > > > > > >      How about I  write a small program to check if the result
> > > > > > > > obtained by this calculation is what it should be?
> > > > > > > I suspect it's less the architecture and more the underlying 
> > > > > > > library.
> > > > > > > As Christophe mentioned, both issues are with newlib which is an C
> > > > > > > library primarily used in the emebedded space.  I believe it's 
> > > > > > > math code
> > > > > > > derives from early BSD libm and likely hasn't been stressed for 
> > > > > > > this
> > > > > > > kind of correctness.  It's lightly maintained (primarily for 
> > > > > > > cygwin).
> > > > > > >
> > > > > > > I'm going to run the testcases in my arm linux chroots.  That 
> > > > > > > should
> > > > > > > allow us to rule out codegen issues as those chroots will be 
> > > > > > > using glibc
> > > > > > > for their math library.
> > > > > > >
> > > > > > > jeff

Re: [PATCH] Optimize sin(atan(x)) and cos(atan(x)), take 3 (PR tree-optimization/86829)

Reply via email to