On Wed, Sep 14, 2011 at 04:00:02AM -0400, David Miller wrote:
>
> Eric, this is a preliminary version of the FMA patch I've been
> working on.  Just so you can see what I'm doing.
>
> First, ignore the fact that there are two configure tests for the
> presence of support for these instructions.  I'm busy normalizing the
> -xarch options which binutils supports so that they are the same as
> Sun AS and therefore just one test is necessary.
>
> Second, like rs6000 the sparc negate fused multiply instructions
> negate the full result, not the multiply result.  So we cannot use
> those instructions for the fnmadf4/fnmsdf4/fnmasf4/fnmssf4 patterns.
> Since rs6000 provides patterns for such negate operations (presumably
> just in case the combiner creates a match) I have done so for sparc
> as well.
>
> I was really surprised that cpu designers haven't settled on a
> consistent formula for negated fused multiply add/sub instructions.
> Ho hum...
On the powerpc, we have an issue with SPEC 2006 and calculix when FMAs
are generated and -ffast-math is used, where line 307 of rubber.f is:

	tt=datan2(dsqrt(1.d0-cn*cn),cn)/3.d0

The FNMSUB instruction generates a -0.0, while doing the multiply and
subtract as separate operations generates +0.0.  dsqrt returns a -0.0
when given a -0.0, and datan2 (-0.0, 1.0) returns -0.0.

Note, calculix is nothing but nested FMAs, and if you disable FMAs you
get about a 10% drop in performance.

I suspect that the issue may be a powerpc backend issue where the wrong
comparison is generated, but I haven't tracked it down.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com	fax +1 (978) 399-6899