https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713
--- Comment #37 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 23 Jan 2019, hjl.tools at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713
>
> --- Comment #36 from H.J. Lu <hjl.tools at gmail dot com> ---
> (In reply to Richard Biener from comment #34)
> > GCC definitely fails to see the FMA use as opportunity in
> > ix86_emit_swsqrtsf, the a == 0 checking is because of the missing
> > expander w/o avx512er where we could still use the NR sequence
> > with the other instruction.  HJ?
>
> Like this?

Yes.  The lack of an expander for the rsqrt operation is probably
more severe though (causing sqrt + approx recip to appear).

> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index e0d7c74fcec..0bbe3772ab7 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -44855,14 +44855,22 @@ void ix86_emit_swsqrtsf (rtx res, rtx a, machine_mode mode, bool recip)
>          }
>      }
>
> +  mthree = force_reg (mode, mthree);
> +
>    /* e0 = x0 * a */
>    emit_insn (gen_rtx_SET (e0, gen_rtx_MULT (mode, x0, a)));
> -  /* e1 = e0 * x0 */
> -  emit_insn (gen_rtx_SET (e1, gen_rtx_MULT (mode, e0, x0)));
>
> -  /* e2 = e1 - 3. */
> -  mthree = force_reg (mode, mthree);
> -  emit_insn (gen_rtx_SET (e2, gen_rtx_PLUS (mode, e1, mthree)));
> +  if (TARGET_FMA || TARGET_AVX512F)
> +    emit_insn (gen_rtx_SET (e2,
> +                            gen_rtx_FMA (mode, e0, x0, mthree)));
> +  else
> +    {
> +      /* e1 = e0 * x0 */
> +      emit_insn (gen_rtx_SET (e1, gen_rtx_MULT (mode, e0, x0)));
> +
> +      /* e2 = e1 - 3. */
> +      emit_insn (gen_rtx_SET (e2, gen_rtx_PLUS (mode, e1, mthree)));
> +    }
>
>    mhalf = force_reg (mode, mhalf);
>    if (recip)