On Fri, Jun 03, 2016 at 04:50:16PM -0500, Evandro Menezes wrote:
> >>+    return false;
> >>-  emit_insn ((*get_rsqrte_type (mode)) (x0, xsrc));
> >>+  rtx xmsk = gen_reg_rtx (mmsk);
> >>+  if (!recp)
> >>+    /* When calculating the approximate square root, compare the argument with
> >>+       0.0 and create a mask.  */
> >>+    emit_insn (gen_rtx_SET (xmsk, gen_rtx_NEG (mmsk, gen_rtx_EQ (mmsk, src,
> >>+                            CONST0_RTX (mode)))));
> >
> >I guess you've done it this way rather than calling gen_aarch64_cmeq<mode>
> >directly to avoid having a switch on mode? I wonder whether it is worth just
> >writing that helper function to make it explicit what instruction we want
> >to match?
>
> I prefer to avoid calling the gen_...() functions for forward
> portability.  If a future version of the ISA can do it better than
> the explicit gen_...() function, then this just works.  Or at least
> this is the hope.  Again, this is just me.

I prefer calling the gen functions, in the hope that those patterns would be
"upgraded" to cover the new ISA versions. But, I can see your argument so I'm
happy to drop this comment.

> @@ -7369,10 +7372,10 @@ aarch64_builtin_reciprocal (tree fndecl)
>
>  typedef rtx (*rsqrte_type) (rtx, rtx);
>
> -/* Select reciprocal square root initial estimate
> -   insn depending on machine mode.  */
> +/* Select reciprocal square root initial estimate insn depending on machine
> +   mode.  */
>
> -rsqrte_type
> +static rsqrte_type
>  get_rsqrte_type (machine_mode mode)
>  {
>    switch (mode)
> @@ -7382,16 +7385,15 @@ get_rsqrte_type (machine_mode mode)
>      case V2DFmode: return gen_aarch64_rsqrte_v2df2;
>      case V2SFmode: return gen_aarch64_rsqrte_v2sf2;
>      case V4SFmode: return gen_aarch64_rsqrte_v4sf2;
> -    default: gcc_unreachable ();
> +    default: gcc_unreachable ();
>    }
>  }
>
>  typedef rtx (*rsqrts_type) (rtx, rtx, rtx);
>
> -/* Select reciprocal square root Newton-Raphson step
> -   insn depending on machine mode.  */
> +/* Select reciprocal square root series step insn depending on machine mode.  */
>
> -rsqrts_type
> +static rsqrts_type
>  get_rsqrts_type (machine_mode mode)
>  {
>    switch (mode)
> @@ -7401,50 +7403,88 @@ get_rsqrts_type (machine_mode mode)
>      case V2DFmode: return gen_aarch64_rsqrts_v2df3;
>      case V2SFmode: return gen_aarch64_rsqrts_v2sf3;
>      case V4SFmode: return gen_aarch64_rsqrts_v4sf3;
> -    default: gcc_unreachable ();
> +    default: gcc_unreachable ();
>    }
>  }

You'll find these two hunks hit a merge conflict on trunk after Jiong's recent
changes to these pattern names. Just be careful when applying the patch.

The patch is OK for trunk.

Thanks,
James

> From 5c5c07f38cb06507fe997a890dfc5bae1d3179f6 Mon Sep 17 00:00:00 2001
> From: Evandro Menezes <e.mene...@samsung.com>
> Date: Mon, 4 Apr 2016 11:23:29 -0500
> Subject: [PATCH 2/3] [AArch64] Emit square root using the Newton series
>
> 2016-04-04  Evandro Menezes  <e.mene...@samsung.com>
>             Wilco Dijkstra  <wilco.dijks...@arm.com>
>
> gcc/
>     * config/aarch64/aarch64-protos.h
>     (aarch64_emit_approx_rsqrt): Replace with new function
>     "aarch64_emit_approx_sqrt".
>     (cpu_approx_modes): New member "sqrt".
>     * config/aarch64/aarch64.c
>     (generic_approx_modes): New member "sqrt".
>     (exynosm1_approx_modes): Likewise.
>     (xgene1_approx_modes): Likewise.
>     (aarch64_emit_approx_rsqrt): Replace with new function
>     "aarch64_emit_approx_sqrt".
>     (aarch64_override_options_after_change_1): Handle new option.
>     * config/aarch64/aarch64-simd.md
>     (rsqrt<mode>2): Use new function instead.
>     (sqrt<mode>2): New expansion and insn definitions.
>     * config/aarch64/aarch64.md: Likewise.
>     * config/aarch64/aarch64.opt
>     (mlow-precision-sqrt): Add new option description.
>     * doc/invoke.texi (mlow-precision-sqrt): Likewise.