Richard Guenther wrote:
> See also http://www.suse.de/~gcctest/c++bench/polyhedron/analysis.html
> (same conclusion for gas_dyn).

Thanks, I seem to have completely missed that page (though I was aware of your polyhedron tester).

> On 4/27/07, Janne Blomqvist <[EMAIL PROTECTED]> wrote:
>> The reason, it seems, is that ifort (and presumably other commercial
>> compilers with competitive scores in gas_dyn) avoids calculating
>> divisions and square roots, replacing them with reciprocals and
>> reciprocal square roots. E.g. in EOS sqrt(a/b) can be calculated as
>> 1/sqrt(b*(1/a)). This has a big impact on performance, since the SSE
>> instruction set contains very fast instructions for this, rcpps, rcpss,
>> rsqrtps, rsqrtss (PPC/Altivec also has equivalent instructions). These
>> instructions have latencies of 1-2 cycles vs. dozens or even hundreds of
>> cycles for normal division and square root. The price to be paid for
>> this speed is that these reciprocal instructions have an accuracy of
>> only 12 bits, so clearly they can be enabled only for -ffast-math. And
>> they are available only for single precision. I'll file a
>> missed-optimization PR about this.

> I think that even with -ffast-math 12 bits of accuracy is not ok. There is
> the possibility of doing another Newton iteration step to improve
> accuracy; that would be ok for -ffast-math. We can, though, add an
> extra flag -msserecip or however you'd call it to enable use of the
> instructions with less accuracy.

I agree it can be an issue, but OTOH people who care about precision probably (1) avoid -ffast-math and (2) use double precision (where these reciprocal instructions are not available). Intel calls it -no-prec-div, but it's enabled for the "-fast" catch-all option.

On a related note, our beloved competitors generally have some high level flag for combining all these fancy and potentially unsafe optimizations (e.g. -O4, -fast, -fastsse, -Ofast, etc.). For gcc, at least FP benchmarks seem to do generally well with something like "-O3 -funroll-loops -ftree-vectorize -ffast-math -march=native -mfpmath=sse", but it's quite a mouthful.

--
Janne Blomqvist
