RE: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

Kumar, Venkataramanan Wed, 24 Jun 2015 22:15:06 -0700

Hi, 

If I understand correctly, the current implementation replaces

    fdiv
    fsqrt

by

    frsqrte
    for i=0 to 3
        fmul
        frsqrts
        fmul
So I think the gain depends on the latency of the frsqrts insn.

I see the patch has patterns for the vector versions of frsqrts, but does not
enable them?

Regards,
Venkat.

> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Dr. Philipp Tomsich
> Sent: Wednesday, June 24, 2015 10:22 PM
> To: Evandro Menezes
> Cc: Benedikt Huber; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
> estimation in -ffast-math
> 
> Evandro,
> 
> We’ve seen a 28% speed-up on gromacs in SPECfp for the (scalar) reciprocal
> sqrt.
> 
> Also, the “reciprocal divide” patches are floating around in various of our
> git trees, but aren’t ready for public consumption yet… I’ll leave Benedikt to
> comment on potential timelines for getting that pushed out.
> 
> Best,
> Philipp.
> 
> > On 24 Jun 2015, at 18:42, Evandro Menezes <e.mene...@samsung.com> wrote:
> >
> > Benedikt,
> >
> > You beat me to it! :-)  Do you have the implementation for dividing
> > using the Newton series as well?
> >
> > I'm not sure that the series is always faster for all data types and on
> > all processors.  It would be useful to allow each AArch64 processor to
> > enable this or not depending on the data type.  BTW, do you have some
> > tests showing the speed-up?
> >
> > Thank you,
> >
> > --
> > Evandro Menezes                              Austin, TX
> >
> >> -----Original Message-----
> >> From: gcc-patches-ow...@gcc.gnu.org
> >> [mailto:gcc-patches-ow...@gcc.gnu.org] On
> >> Behalf Of Benedikt Huber
> >> Sent: Thursday, June 18, 2015 7:04
> >> To: gcc-patches@gcc.gnu.org
> >> Cc: benedikt.hu...@theobroma-systems.com;
> >> philipp.tomsich@theobroma-systems.com
> >> Subject: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
> >> estimation in -ffast-math
> >>
> >> AArch64 offers the instructions frsqrte and frsqrts, for rsqrt
> >> estimation and a Newton-Raphson step, respectively.
> >> There are ARMv8 implementations where this is faster than using fdiv
> >> and fsqrt.
> >> It runs three steps for double and two steps for float to achieve the
> >> needed precision.
> >>
> >> There is one caveat and open question.
> >> Since -ffast-math enables flush to zero, intermediate values between
> >> approximation steps will be flushed to zero if they are denormal.
> >> E.g., this happens in the case of rsqrt (DBL_MAX) and rsqrtf (FLT_MAX).
> >> The test cases pass, but it is unclear to me whether this is expected
> >> behavior with -ffast-math.
> >>
> >> The patch applies to commit:
> >> svn+ssh://gcc.gnu.org/svn/gcc/trunk@224470
> >>
> >> Please consider including this patch.
> >> Thank you and best regards,
> >> Benedikt Huber
> >>
> >> Benedikt Huber (1):
> >>  2015-06-15  Benedikt Huber  <benedikt.huber@theobroma-systems.com>
> >>
> >> gcc/ChangeLog                            |   9 +++
> >> gcc/config/aarch64/aarch64-builtins.c    |  60 ++++++++++++++++
> >> gcc/config/aarch64/aarch64-protos.h      |   2 +
> >> gcc/config/aarch64/aarch64-simd.md       |  27 ++++++++
> >> gcc/config/aarch64/aarch64.c             |  63 +++++++++++++++++
> >> gcc/config/aarch64/aarch64.md            |   3 +
> >> gcc/testsuite/gcc.target/aarch64/rsqrt.c | 113 +++++++++++++++++++++++++++++++
> >> 7 files changed, 277 insertions(+)
> >> create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt.c
> >>
> >> --
> >> 1.9.1
> > <Mail Attachment.eml>
