Re: weird optimization in sin+cos, x86 backend

James Courtier-Dutton Thu, 09 Feb 2012 07:56:06 -0800

On 9 February 2012 14:51, James Courtier-Dutton <[email protected]> wrote:
> 2012/2/9 Andrew Haley <[email protected]>:
>> On 02/09/2012 01:38 PM, Tim Prince wrote:
>>> x87 built-ins should be a fair compromise between speed, code size, and
>>> accuracy, for long double, on most CPUs.  As Richard says, it's
>>> certainly possible to do better in the context of SSE, but gcc doesn't
>>> know anything about the quality of math libraries present; it doesn't
>>> even take into account whether it's glibc or something else.
>>
>> Yeah.  On x86_64 glibc, we should really get the sincos bg fixed.  And
>> it really is a bug: cos() and sin() are correctly rounded but sincos()
>> isn't.
>>
>> Andrew.
>
> Given:
>
> gcc -g -O0 -c -o sincos1.o sincos1.c
> gcc -static -g -o sincos1 sincos1.o -lm
>
> ./sincos1
> sin = -8.52200849767188795e-01    (uses xmm register intructions)
> sinl = 0.46261304076460176          (uses fprem and fsin)
> sincos = 4.62613040764601746e-01 (uses fprem and fsin)
> sincosl = 0.46261304076460176       (uses fprem and fsin)
>
> It looks to me that only sin() is evaluation correctly using xmm
> register instructions.
> sinl, sincos, and sincosl all use fprem and fsin and fsincos.
>
> From my analysis, fprem is causing a lot of the problems.
> Using a combination of xmm register instructions instead of fprem
> might cure our problems with fsincos.
> Alternatively, make both sin() and sinl() use xmm register intructions.


Results for x86_64
gcc -g -O0 -c -o sincos1.o sincos1.c
gcc -static -g -o sincos1 sincos1.o -lm

./sincos1
sin = -8.52200849767188795e-01    (uses xmm register intructions)
sinl = 0.46261304076460176          (uses fprem and fsin)
sincos = 4.62613040764601746e-01 (uses fprem and fsin)
sincosl = 0.46261304076460176       (uses fprem and fsin)

Only sin() gets an accurate answer.

Results when compiled for 32bit x86.
gcc -m32 -g -O0 -c -o sincos1.o sincos1.c
gcc -m32 -static -g -o sincos1 sincos1.o -lm

./sincos1
sin = 4.62613040764601746e-01
sinl = 0.46261304076460176
sincos = 4.62613040764601746e-01
sincosl = 0.46261304076460176

Which are all inaccurate.

So, we have a case of the same program compiled for 32bit might give
different floating point results than the same source compiled for
64bit.

From what I can tell, the xmm register instructions on x86_64 are
using 128bit precision which probably explains why the result is more
accurate.

Re: weird optimization in sin+cos, x86 backend

Reply via email to