On Thu, Oct 20, 2011 at 4:45 PM, Joseph S. Myers
<jos...@codesourcery.com> wrote:

>> The patch was tested on x86_64-pc-linux-gnu, but I would like Joseph
>> to check if I didn't mess something with options handling.
>
> I have no comments on the option handling in this patch.
>
>> +for vectorized single float division and vectorized sqrtf(x) already with
>
> @code{sqrtf (@var{x})}

Thanks - fixed, with a similar fix in the previous paragraph.

I also found a PR that deals with vectorized reciprocal, so I referred
to the PR in the ChangeLog entry:

2011-10-20  Uros Bizjak  <ubiz...@gmail.com>

        PR target/47989
        * config/i386/i386.h (RECIP_MASK_DEFAULT): New define.
        * config/i386/i386.opt (recip_mask): Initialize with RECIP_MASK_DEFAULT.
        * doc/invoke.texi (ix86 Options, -mrecip): Document that GCC
        implements vectorized single float division and vectorized sqrtf(x)
        with reciprocal sequence with additional Newton-Raphson step with
        -ffast-math.

Attached is the patch that was committed to mainline SVN. Encouraged
by Michael's results, let's see what the automated benchmark testers
show.

Uros.
Index: config/i386/i386.h
===================================================================
--- config/i386/i386.h  (revision 180255)
+++ config/i386/i386.h  (working copy)
@@ -2322,6 +2322,7 @@
 #define RECIP_MASK_VEC_SQRT    0x08
 #define RECIP_MASK_ALL (RECIP_MASK_DIV | RECIP_MASK_SQRT \
                         | RECIP_MASK_VEC_DIV | RECIP_MASK_VEC_SQRT)
+#define RECIP_MASK_DEFAULT (RECIP_MASK_VEC_DIV | RECIP_MASK_VEC_SQRT)
 
 #define TARGET_RECIP_DIV       ((recip_mask & RECIP_MASK_DIV) != 0)
 #define TARGET_RECIP_SQRT      ((recip_mask & RECIP_MASK_SQRT) != 0)
Index: config/i386/i386.opt
===================================================================
--- config/i386/i386.opt        (revision 180255)
+++ config/i386/i386.opt        (working copy)
@@ -32,7 +32,7 @@
 HOST_WIDE_INT ix86_isa_flags_explicit
 
 TargetVariable
-int recip_mask
+int recip_mask = RECIP_MASK_DEFAULT
 
 Variable
 int recip_mask_explicit
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi     (revision 180255)
+++ doc/invoke.texi     (working copy)
@@ -12922,7 +12922,12 @@
 of the non-reciprocal instruction, the precision of the sequence can be
 decreased by up to 2 ulp (i.e. the inverse of 1.0 equals 0.99999994).
 
-Note that GCC implements 1.0f/sqrtf(x) in terms of RSQRTSS (or RSQRTPS)
+Note that GCC implements @code{1.0f/sqrtf(@var{x})} in terms of RSQRTSS
+(or RSQRTPS) already with @option{-ffast-math} (or the above option
+combination), and doesn't need @option{-mrecip}.
+
+Also note that GCC emits the above sequence with additional Newton-Raphson step
+for vectorized single float division and vectorized @code{sqrtf(@var{x})}
 already with @option{-ffast-math} (or the above option combination), and
 doesn't need @option{-mrecip}.
 
