Philipp,

I think that execute_cse_reciprocals_1() applies only when the denominator is known at compile-time, otherwise the division stays. It doesn't seem to know whether the target supports the approximate reciprocal or not.
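
To make the x/y = x * 1/y case concrete, here is a minimal C sketch of a division done through an approximate reciprocal refined by Newton-Raphson steps. It is not taken from the patch or from GCC: recip_estimate(), approx_recip() and approx_div() are hypothetical names, and the software estimate merely stands in for a hardware estimate instruction such as AArch64's FRECPE.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical stand-in for a hardware reciprocal estimate: a linear
   guess on the mantissa, accurate to only a few bits.
   Assumption: y is finite and non-zero.  */
static double
recip_estimate (double y)
{
  assert (y != 0.0 && isfinite (y));

  int e;
  double m = frexp (fabs (y), &e);          /* |y| = m * 2^e, 0.5 <= m < 1 */
  double r = 48.0 / 17.0 - 32.0 / 17.0 * m; /* rough approximation of 1/m */
  r = ldexp (r, -e);                        /* scale back: 1/|y| ~ r * 2^-e */
  return y < 0.0 ? -r : r;
}

/* Approximate 1/y.  Each Newton-Raphson step r = r * (2 - y * r)
   roughly doubles the number of correct bits.  */
static double
approx_recip (double y, int steps)
{
  double r = recip_estimate (y);
  for (int i = 0; i < steps; i++)
    r = r * (2.0 - y * r);
  return r;
}

/* x / y rewritten as x * (1 / y), with 1 / y approximated.  */
static double
approx_div (double x, double y, int steps)
{
  return x * approx_recip (y, steps);
}
```

With this crude estimate about four steps are needed for double precision; a better hardware estimate would need fewer. The result is not guaranteed to be correctly rounded, which is why a rewrite of this kind only makes sense under something like -ffast-math.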

Cheers,

--
Evandro Menezes Austin, TX

> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On
> Behalf Of Dr. Philipp Tomsich
> Sent: Wednesday, June 24, 2015 15:08
> To: Evandro Menezes
> Cc: Benedikt Huber; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
> estimation in -ffast-math
>
> Evandro,
>
> Shouldn't ‘execute_cse_reciprocals_1’ take care of this, once the
> reciprocal-division is implemented?
> Do you think there’s additional work needed to catch all cases/opportunities?
>
> Best,
> Philipp.
>
> > On 24 Jun 2015, at 20:19, Evandro Menezes <e.mene...@samsung.com> wrote:
> >
> > Benedikt,
> >
> > Are you developing the reciprocal approximation just for 1/x proper or for
> > any division, as in x/y = x * 1/y?
> >
> > Thank you,
> >
> > --
> > Evandro Menezes Austin, TX
> >
> >> -----Original Message-----
> >> From: Benedikt Huber [mailto:benedikt.hu...@theobroma-systems.com]
> >> Sent: Wednesday, June 24, 2015 12:11
> >> To: Dr. Philipp Tomsich
> >> Cc: Evandro Menezes; gcc-patches@gcc.gnu.org
> >> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root
> >> (rsqrt) estimation in -ffast-math
> >>
> >> Evandro,
> >>
> >> Yes, we also have the 1/x approximation.
> >> However we do not have the test cases yet, and it also would need
> >> some clean up.
> >> I am going to provide a patch for that soon (say next week).
> >> Also, for this optimization we have *not* yet found a benchmark with
> >> significant improvements.
> >>
> >> Best Regards,
> >> Benedikt
> >>
> >>> On 24 Jun 2015, at 18:52, Dr. Philipp Tomsich
> >>> <philipp.tomsich@theobroma-systems.com> wrote:
> >>>
> >>> Evandro,
> >>>
> >>> We’ve seen a 28% speed-up on gromacs in SPECfp for the (scalar)
> >>> reciprocal sqrt.
> >>>
> >>> Also, the “reciprocal divide” patches are floating around in various
> >>> of our git-tree, but aren’t ready for public consumption, yet… I’ll
> >>> leave Benedikt to comment on potential timelines for getting that
> >>> pushed out.
> >>>
> >>> Best,
> >>> Philipp.
> >>>
> >>>> On 24 Jun 2015, at 18:42, Evandro Menezes <e.mene...@samsung.com> wrote:
> >>>>
> >>>> Benedikt,
> >>>>
> >>>> You beat me to it! :-) Do you have the implementation for dividing
> >>>> using the Newton series as well?
> >>>>
> >>>> I'm not sure that the series is always for all data types and on
> >>>> all processors. It would be useful to allow each AArch64 processor
> >>>> to enable this or not depending on the data type. BTW, do you have
> >>>> some tests showing the speed up?
> >>>>
> >>>> Thank you,
> >>>>
> >>>> --
> >>>> Evandro Menezes Austin, TX
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: gcc-patches-ow...@gcc.gnu.org
> >>>>> [mailto:gcc-patches-ow...@gcc.gnu.org] On
> >>>>> Behalf Of Benedikt Huber
> >>>>> Sent: Thursday, June 18, 2015 7:04
> >>>>> To: gcc-patches@gcc.gnu.org
> >>>>> Cc: benedikt.hu...@theobroma-systems.com;
> >>>>> philipp.tomsich@theobroma-systems.com
> >>>>> Subject: [PATCH] [aarch64] Implemented reciprocal square root
> >>>>> (rsqrt) estimation in -ffast-math
> >>>>>
> >>>>> AArch64 offers the instructions frsqrte and frsqrts, for rsqrt
> >>>>> estimation and a Newton-Raphson step, respectively.
> >>>>> There are ARMv8 implementations where this is faster than using
> >>>>> fdiv and rsqrt.
> >>>>> It runs three steps for double and two steps for float to achieve
> >>>>> the needed precision.
> >>>>>
> >>>>> There is one caveat and open question.
> >>>>> Since -ffast-math enables flush to zero, intermediate values
> >>>>> between approximation steps will be flushed to zero if they are
> >>>>> denormal.
> >>>>> E.g. This happens in the case of rsqrt (DBL_MAX) and rsqrtf (FLT_MAX).
> >>>>> The test cases pass, but it is unclear to me whether this is
> >>>>> expected behavior with -ffast-math.
> >>>>>
> >>>>> The patch applies to commit:
> >>>>> svn+ssh://gcc.gnu.org/svn/gcc/trunk@224470
> >>>>>
> >>>>> Please consider including this patch.
> >>>>> Thank you and best regards,
> >>>>> Benedikt Huber
> >>>>>
> >>>>> Benedikt Huber (1):
> >>>>> 2015-06-15 Benedikt Huber <benedikt.hu...@theobroma-systems.com>
> >>>>>
> >>>>>  gcc/ChangeLog                            |   9 +++
> >>>>>  gcc/config/aarch64/aarch64-builtins.c    |  60 ++++++++++++++++
> >>>>>  gcc/config/aarch64/aarch64-protos.h      |   2 +
> >>>>>  gcc/config/aarch64/aarch64-simd.md       |  27 ++++++++
> >>>>>  gcc/config/aarch64/aarch64.c             |  63 +++++++++++++++++
> >>>>>  gcc/config/aarch64/aarch64.md            |   3 +
> >>>>>  gcc/testsuite/gcc.target/aarch64/rsqrt.c | 113 +++++++++++++++++++++++++++++++
> >>>>>  7 files changed, 277 insertions(+)
> >>>>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt.c
> >>>>>
> >>>>> --
> >>>>> 1.9.1