Philipp,

I think that execute_cse_reciprocals_1() applies only when the denominator is 
known at compile time; otherwise, the division stays.  It doesn't seem to know 
whether the target supports the approximate reciprocal or not.

Cheers,

-- 
Evandro Menezes                              Austin, TX


> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On
> Behalf Of Dr. Philipp Tomsich
> Sent: Wednesday, June 24, 2015 15:08
> To: Evandro Menezes
> Cc: Benedikt Huber; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
> estimation in -ffast-math
> 
> Evandro,
> 
> Shouldn't ‘execute_cse_reciprocals_1’ take care of this once the reciprocal
> division is implemented?
> Do you think there’s additional work needed to catch all cases/opportunities?
> 
> Best,
> Philipp.
> 
> > On 24 Jun 2015, at 20:19, Evandro Menezes <e.mene...@samsung.com> wrote:
> >
> > Benedikt,
> >
> > Are you developing the reciprocal approximation just for 1/x proper or for
> > any division, as in x/y = x * 1/y?
> >
> > Thank you,
> >
> > --
> > Evandro Menezes                              Austin, TX
> >
> >
> >> -----Original Message-----
> >> From: Benedikt Huber [mailto:benedikt.hu...@theobroma-systems.com]
> >> Sent: Wednesday, June 24, 2015 12:11
> >> To: Dr. Philipp Tomsich
> >> Cc: Evandro Menezes; gcc-patches@gcc.gnu.org
> >> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root
> >> (rsqrt) estimation in -ffast-math
> >>
> >> Evandro,
> >>
> >> Yes, we also have the 1/x approximation.
> >> However, we do not have the test cases yet, and it would also need
> >> some cleanup.
> >> I am going to provide a patch for that soon (say next week).
> >> Also, for this optimization we have *not* yet found a benchmark with
> >> significant improvements.
> >>
> >> Best Regards,
> >> Benedikt
> >>
> >>
> >>> On 24 Jun 2015, at 18:52, Dr. Philipp Tomsich
> >>> <philipp.tomsich@theobroma-systems.com> wrote:
> >>>
> >>> Evandro,
> >>>
> >>> We’ve seen a 28% speed-up on gromacs in SPECfp for the (scalar)
> >>> reciprocal sqrt.
> >>>
> >>> Also, the “reciprocal divide” patches are floating around in several
> >>> of our git trees, but aren’t ready for public consumption yet… I’ll
> >>> leave Benedikt to comment on potential timelines for getting that
> >>> pushed out.
> >>>
> >>> Best,
> >>> Philipp.
> >>>
> >>>> On 24 Jun 2015, at 18:42, Evandro Menezes <e.mene...@samsung.com> wrote:
> >>>>
> >>>> Benedikt,
> >>>>
> >>>> You beat me to it! :-)  Do you have the implementation for dividing
> >>>> using the Newton series as well?
> >>>>
> >>>> I'm not sure that the series is always faster for all data types and
> >>>> on all processors.  It would be useful to allow each AArch64 processor
> >>>> to enable this or not depending on the data type.  BTW, do you have
> >>>> some tests showing the speed-up?
> >>>>
> >>>> Thank you,
> >>>>
> >>>> --
> >>>> Evandro Menezes                              Austin, TX
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: gcc-patches-ow...@gcc.gnu.org
> >>>>> [mailto:gcc-patches-ow...@gcc.gnu.org] On
> >>>>> Behalf Of Benedikt Huber
> >>>>> Sent: Thursday, June 18, 2015 7:04
> >>>>> To: gcc-patches@gcc.gnu.org
> >>>>> Cc: benedikt.hu...@theobroma-systems.com;
> >>>>> philipp.tomsich@theobroma-systems.com
> >>>>> Subject: [PATCH] [aarch64] Implemented reciprocal square root
> >>>>> (rsqrt) estimation in -ffast-math
> >>>>>
> >>>>> AArch64 offers the instructions frsqrte and frsqrts, for rsqrt
> >>>>> estimation and a Newton-Raphson step, respectively.
> >>>>> There are ARMv8 implementations where this is faster than using
> >>>>> fdiv and fsqrt.
> >>>>> It runs three steps for double and two steps for float to achieve
> >>>>> the needed precision.
> >>>>>
> >>>>> There is one caveat and open question.
> >>>>> Since -ffast-math enables flush-to-zero, intermediate values
> >>>>> between approximation steps will be flushed to zero if they are
> >>>>> denormal.
> >>>>> E.g., this happens in the case of rsqrt (DBL_MAX) and rsqrtf (FLT_MAX).
> >>>>> The test cases pass, but it is unclear to me whether this is
> >>>>> expected behavior with -ffast-math.
> >>>>>
> >>>>> The patch applies to commit:
> >>>>> svn+ssh://gcc.gnu.org/svn/gcc/trunk@224470
> >>>>>
> >>>>> Please consider including this patch.
> >>>>> Thank you and best regards,
> >>>>> Benedikt Huber
> >>>>>
> >>>>> Benedikt Huber (1):
> >>>>> 2015-06-15  Benedikt Huber  <benedikt.hu...@theobroma-systems.com>
> >>>>>
> >>>>> gcc/ChangeLog                            |   9 +++
> >>>>> gcc/config/aarch64/aarch64-builtins.c    |  60 ++++++++++++++++
> >>>>> gcc/config/aarch64/aarch64-protos.h      |   2 +
> >>>>> gcc/config/aarch64/aarch64-simd.md       |  27 ++++++++
> >>>>> gcc/config/aarch64/aarch64.c             |  63 +++++++++++++++++
> >>>>> gcc/config/aarch64/aarch64.md            |   3 +
> >>>>> gcc/testsuite/gcc.target/aarch64/rsqrt.c | 113 +++++++++++++++++++++++++++++++
> >>>>> 7 files changed, 277 insertions(+)
> >>>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt.c
> >>>>>
> >>>>> --
> >>>>> 1.9.1
> >>>
> >
> >
