On 26/03/12 11:14, Andrew Stubbs wrote:
> On 28/02/12 17:45, Andrew Stubbs wrote:
>> Hi all,
>>
>> This patch adds a DImode negate pattern for NEON.
>>
>> Unfortunately, the NEON vneg instruction only supports vectors, not
>> singletons, so there's no direct way to do it in DImode, and the
>> compiler ends up moving the value back to core registers, negating it,
>> and returning to NEON afterwards:
>>
>> fmrrd r2, r3, d16 @ int
>> negs r2, r2
>> sbc r3, r3, r3, lsl #1
>> fmdrr d16, r2, r3 @ int
>>
>> The new patch does it entirely in NEON:
>>
>> vmov.i32 d17, #0 @ di
>> vsub.i64 d16, d17, d16
>>
>> (Note that this is the result when combined with my recent patch for
>> NEON DImode immediates. Without that you get a constant pool load.)
>
> This update fixes a bootstrap failure caused by an early clobber error.
> I've also got a native regression test running now.
>
> OK?
>
> Andrew
>
>
> neon-neg64.patch
>
>
> 2012-03-26 Andrew Stubbs <a...@codesourcery.com>
>
> gcc/
> * config/arm/arm.md (negdi2): Use gen_negdi2_neon.
> * config/arm/neon.md (negdi2_neon): New insn.
> Also add splitters for core and NEON registers.
>
> ---
> gcc/config/arm/arm.md | 8 +++++++-
> gcc/config/arm/neon.md | 37 +++++++++++++++++++++++++++++++++++++
> 2 files changed, 44 insertions(+), 1 deletions(-)
>
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 751997f..f1dbbf7 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -4048,7 +4048,13 @@
> (neg:DI (match_operand:DI 1 "s_register_operand" "")))
> (clobber (reg:CC CC_REGNUM))])]
> "TARGET_EITHER"
> - ""
> + {
> + if (TARGET_NEON)
> + {
> + emit_insn (gen_negdi2_neon (operands[0], operands[1]));
> + DONE;
> + }
> + }
> )
>
> ;; The constraints here are to prevent a *partial* overlap (where %Q0 ==
> %R1).
> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
> index 3c88568..bf229a7 100644
> --- a/gcc/config/arm/neon.md
> +++ b/gcc/config/arm/neon.md
> @@ -922,6 +922,43 @@
> (const_string "neon_int_3")))]
> )
>
> +(define_insn "negdi2_neon"
> + [(set (match_operand:DI 0 "s_register_operand" "= w,?r,?&r,?w")
> + (neg:DI (match_operand:DI 1 "s_register_operand" " w, 0, r, w")))
> + (clobber (match_scratch:DI 2 "=&w, X, X,&w"))
> + (clobber (reg:CC CC_REGNUM))]
> + "TARGET_NEON"
> + "#"
> + [(set_attr "length" "8")
> + (set_attr "arch" "nota8,*,*,onlya8")]
> +)
> +

If negation in Neon needs a scratch register, it seems to me to be
somewhat odd that we're disparaging the ARM version.

Also, wouldn't it be sensible to support a variant that was
early-clobber on operand 0, but loaded immediate zero into that value
first:

	vmov	Dd, #0
	vsub	Dd, Dd, Dm

That way you'll never need more than two registers, whereas today you
want three.

R.
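
To make the early-clobber point concrete, here is a rough C model of the two-instruction sequence suggested above. It is only a sketch: the `neon_regs` type and the `neg_zero_first` function are hypothetical names invented for illustration, and nothing here is GCC or ARM code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the 64-bit NEON D registers (illustration only). */
typedef struct { uint64_t d[32]; } neon_regs;

/* Models "vmov Dd, #0; vsub.i64 Dd, Dd, Dm": two registers, no scratch. */
void neg_zero_first(neon_regs *r, int dd, int dm)
{
    assert(dd >= 0 && dd < 32 && dm >= 0 && dm < 32);
    r->d[dd] = 0;                     /* vmov Dd, #0 */
    r->d[dd] = r->d[dd] - r->d[dm];   /* vsub.i64 Dd, Dd, Dm (wraps mod 2^64) */
}
```

When Dd and Dm are different registers, Dd ends up holding the two's complement negation of Dm and Dm is left untouched. When they are the same register, the input is zeroed before it is read and the result is 0, which is why the destination would have to be early-clobber in such a variant.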