On 6/19/07, tbp <[EMAIL PROTECTED]> wrote:

Indeed there are holes in every direction when you pull in such
transformation, and the cost of plugging every one of them would be
prohibitive; the next batch of c2d supposedly will leave you with ~6
cycles to make it worthwhile for a sqrt.

My C2D has 6 cycles sqrt and 6 cycles div (measured by mubench, not
"from the specs"), but gas_dyn still runs 30% faster with reciprocals.

My point merely was that, considering one operation, you'd introduce
NaN for a not so special value (0) which, in a *fast* math scenario,
could be produced at any previous stage due to denormal clamping; with
no sane way to take care of it.
Again, if you look at prior art (icc, AMD's manual...), that's the
only special case they covered.

sqrt(0.0) = NaN is indeed a bit strange. I'll add the trick with
min(x, maxval), as I think it is faster than compare + pand.

Admittedly that's a trade-off, but not that unreasonable.

Now, an option to remove such transformations from -ffast-math
bag-o-tricks would be fine and would still buy gcc some Spec bragging
rights :)

Due to all combinations with rsqrt and rcpss, I'm a bit nervous about
including this by default into -ffast-math. OTOH, can somebody measure
the impact of -mrecip in spec?

Uros.
