Hello!

This patch builds on the recent patch by Michael (which implemented
fine-grained control of the -mrecip option). With -ffast-math, GCC now
emits reciprocal sequences with an additional NR step for vectorized
SFmode division and vectorized sqrtf(x).
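
For readers unfamiliar with the NR step: the hardware estimate (rcpps,
rsqrtps) is accurate to only about 12 bits, and one Newton-Raphson
iteration roughly doubles that. Below is a minimal scalar C sketch of
the arithmetic only, not the code GCC emits. The function names are
made up for illustration, and the estimate is passed in as a parameter
to stand in for the instruction result.

```c
/* Illustrative sketch only: all function names here are hypothetical
   and the estimates x0/y0 stand in for rcpps/rsqrtps results.  */

/* One Newton-Raphson step for 1/b, starting from estimate x0:
   x1 = x0 * (2 - b * x0).  */
static float
recip_nr_step (float b, float x0)
{
  return x0 * (2.0f - b * x0);
}

/* One Newton-Raphson step for 1/sqrt(a), starting from estimate y0:
   y1 = 0.5 * y0 * (3 - a * y0 * y0).  */
static float
rsqrt_nr_step (float a, float y0)
{
  return 0.5f * y0 * (3.0f - a * y0 * y0);
}

/* a / b computed as a * (refined estimate of 1/b).  */
static float
div_via_recip (float a, float b, float x0)
{
  return a * recip_nr_step (b, x0);
}

/* sqrt(a) computed as a * (refined estimate of 1/sqrt(a)).  */
static float
sqrt_via_rsqrt (float a, float y0)
{
  return a * rsqrt_nr_step (a, y0);
}
```

For example, div_via_recip (1.0f, 3.0f, 0.3334f) refines the crude
estimate 0.3334 to within about 1e-6 of 1/3. The result is still not
correctly rounded, which is why the sequence is tied to -ffast-math.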

2011-10-20  Uros Bizjak  <ubiz...@gmail.com>

        * config/i386/i386.h (RECIP_MASK_DEFAULT): New define.
        * config/i386/i386.opt (recip_mask): Initialize with RECIP_MASK_DEFAULT.
        * doc/invoke.texi (mrecip): Document that GCC implements vectorized
        single float division and vectorized sqrtf(x) with reciprocal sequence
        with additional Newton-Raphson step with -ffast-math.

The patch was tested on x86_64-pc-linux-gnu, but I would like Joseph
to check that I didn't mess something up in the options handling.

The effect of the patch is a 7% speedup of gas_dyn from the Polyhedron
test suite on corei7-avx.

Uros.
Index: config/i386/i386.h
===================================================================
--- config/i386/i386.h  (revision 180176)
+++ config/i386/i386.h  (working copy)
@@ -2322,6 +2322,7 @@
 #define RECIP_MASK_VEC_SQRT    0x08
 #define RECIP_MASK_ALL (RECIP_MASK_DIV | RECIP_MASK_SQRT \
                         | RECIP_MASK_VEC_DIV | RECIP_MASK_VEC_SQRT)
+#define RECIP_MASK_DEFAULT (RECIP_MASK_VEC_DIV | RECIP_MASK_VEC_SQRT)
 
 #define TARGET_RECIP_DIV       ((recip_mask & RECIP_MASK_DIV) != 0)
 #define TARGET_RECIP_SQRT      ((recip_mask & RECIP_MASK_SQRT) != 0)
Index: config/i386/i386.opt
===================================================================
--- config/i386/i386.opt        (revision 180176)
+++ config/i386/i386.opt        (working copy)
@@ -32,7 +32,7 @@
 HOST_WIDE_INT ix86_isa_flags_explicit
 
 TargetVariable
-int recip_mask
+int recip_mask = RECIP_MASK_DEFAULT
 
 Variable
 int recip_mask_explicit
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi     (revision 180176)
+++ doc/invoke.texi     (working copy)
@@ -12927,6 +12927,11 @@
 already with @option{-ffast-math} (or the above option combination), and
 doesn't need @option{-mrecip}.
 
+Also note that GCC emits the above sequence with additional Newton-Raphson step
+for vectorized single float division and vectorized sqrtf(x) already with
+@option{-ffast-math} (or the above option combination), and doesn't need
+@option{-mrecip}.
+
 @item -mrecip=@var{opt}
 @opindex mrecip=opt
 This option allows to control which reciprocal estimate instructions
