> -----Original Message-----
> I agree - for example powerpc has -mrecip= to control which instructions
> to use (float/double rsqrt or inverse) and -mrecip-precision to
> specify whether further iteration is done or not.
> 
> x86 has similar options but always performs a Newton-Raphson iteration,
> documenting 2 ulp instead of 0.5 ulp precision.
> 
> Your suggested huge reduction in precision isn't usually acceptable
> and should always be explicitly enabled.

There isn't a problem with *this* patch (although we do have existing accuracy 
issues thanks to previous documents lacking the information).

The "inaccurate" instructions are single-precision only, and therefore 
acceptable with -ffast-math.
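
For readers unfamiliar with the refinement step mentioned in the quoted text, 
here is a minimal C sketch of one Newton-Raphson iteration applied to a 
low-precision reciprocal estimate. It is not taken from the patch: approx_recip 
is a hypothetical stand-in that truncates the exact result to about 8 mantissa 
bits, and the function names are made up for illustration. Real hardware 
estimate instructions have their own documented error bounds.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware reciprocal-estimate instruction:
   computes 1/x and then discards the low 15 mantissa bits, leaving
   roughly 8 bits of precision.  Assumes x is nonzero and finite. */
static float approx_recip(float x)
{
    assert(x != 0.0f);
    float r = 1.0f / x;
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFF8000u;  /* keep sign, exponent, top 8 mantissa bits */
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* One Newton-Raphson step for f(r) = 1/r - x, i.e. r' = r * (2 - x * r).
   Each step roughly squares the relative error of the estimate. */
static float refine_recip(float x, float r)
{
    return r * (2.0f - x * r);
}

/* Relative error of r as an estimate of 1/x, measured as |x * r - 1|. */
static float recip_rel_error(float x, float r)
{
    float e = x * r - 1.0f;
    return e < 0.0f ? -e : e;
}
```

Calling refine_recip(x, approx_recip(x)) gives the "estimate plus one 
iteration" sequence; skipping the refine step corresponds to using the raw 
estimate alone.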

Kwok intends to provide vectorized library calls for the double-precision and 
-fno-fast-math cases.

In general I want to avoid adding extra arch-specific options; partly because 
approximately no one will use them, and partly because the amdgcn compiler is 
almost always hidden behind an x86_64 compiler.

Andrew
