Evandro,

We’ve seen a 28% speed-up on gromacs in SPECfp for the (scalar) reciprocal sqrt.
Also, the “reciprocal divide” patches are floating around in several of our git trees, but aren’t ready for public consumption yet… I’ll leave Benedikt to comment on potential timelines for getting that pushed out.

Best,
Philipp.

> On 24 Jun 2015, at 18:42, Evandro Menezes <e.mene...@samsung.com> wrote:
>
> Benedikt,
>
> You beat me to it! :-) Do you have the implementation for dividing using
> the Newton series as well?
>
> I'm not sure that the series is always for all data types and on all
> processors. It would be useful to allow each AArch64 processor to enable
> this or not depending on the data type. BTW, do you have some tests showing
> the speed up?
>
> Thank you,
>
> --
> Evandro Menezes Austin, TX
>
>> -----Original Message-----
>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On
>> Behalf Of Benedikt Huber
>> Sent: Thursday, June 18, 2015 7:04
>> To: gcc-patches@gcc.gnu.org
>> Cc: benedikt.hu...@theobroma-systems.com; philipp.tomsich@theobroma-systems.com
>> Subject: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
>> estimation in -ffast-math
>>
>> AArch64 offers the instructions frsqrte and frsqrts, for rsqrt estimation and
>> a Newton-Raphson step, respectively.
>> There are ARMv8 implementations where this is faster than using fdiv and
>> rsqrt.
>> It runs three steps for double and two steps for float to achieve the needed
>> precision.
>>
>> There is one caveat and open question.
>> Since -ffast-math enables flush to zero, intermediate values between
>> approximation steps will be flushed to zero if they are denormal.
>> E.g. this happens in the case of rsqrt (DBL_MAX) and rsqrtf (FLT_MAX).
>> The test cases pass, but it is unclear to me whether this is expected
>> behavior with -ffast-math.
>>
>> The patch applies to commit:
>> svn+ssh://gcc.gnu.org/svn/gcc/trunk@224470
>>
>> Please consider including this patch.
>> Thank you and best regards,
>> Benedikt Huber
>>
>> Benedikt Huber (1):
>>   2015-06-15  Benedikt Huber  <benedikt.hu...@theobroma-systems.com>
>>
>>  gcc/ChangeLog                            |   9 +++
>>  gcc/config/aarch64/aarch64-builtins.c    |  60 ++++++++++++++++
>>  gcc/config/aarch64/aarch64-protos.h      |   2 +
>>  gcc/config/aarch64/aarch64-simd.md       |  27 ++++++++
>>  gcc/config/aarch64/aarch64.c             |  63 +++++++++++++++++
>>  gcc/config/aarch64/aarch64.md            |   3 +
>>  gcc/testsuite/gcc.target/aarch64/rsqrt.c | 113 +++++++++++++++++++++++++++++++
>>  7 files changed, 277 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt.c
>>
>> --
>> 1.9.1