Hi,

While looking at a failure with regrename and mvectorize-with-neon-quad, I noticed that the early-clobber in this vec_pack_trunc pattern is superfluous, given that we can use reg_overlap_mentioned_p to decide in which order to emit these 2 instructions. While this patch works around the problem in regrename.c, I still think that the behaviour in regrename is a bit suspicious and needs some more investigation.

Refer to my post on gcc@ for more on that particular case.

http://gcc.gnu.org/ml/gcc/2011-08/msg00284.html

I am currently running tests with Ira's patch of this morning

http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01304.html

that turns on mvectorize-with-neon-quad by default to make sure there are no regressions. Will commit if no regressions.

cheers
Ramana

2011-08-16  Ramana Radhakrishnan  <ramana.radhakrish...@linaro.org>

	* config/arm/neon.md (vec_pack_trunc_<mode> VN): Remove early-clobber.
	Adjust output template for overlap checks.

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 24dd941..06c699a 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -5631,14 +5631,19 @@
 ; the semantics of the instructions require.
 
 (define_insn "vec_pack_trunc_<mode>"
- [(set (match_operand:<V_narrow_pack> 0 "register_operand" "=&w")
+ [(set (match_operand:<V_narrow_pack> 0 "register_operand" "=w")
        (vec_concat:<V_narrow_pack>
                (truncate:<V_narrow>
                        (match_operand:VN 1 "register_operand" "w"))
                (truncate:<V_narrow>
                        (match_operand:VN 2 "register_operand" "w"))))]
  "TARGET_NEON && !BYTES_BIG_ENDIAN"
- "vmovn.i<V_sz_elem>\t%e0, %q1\;vmovn.i<V_sz_elem>\t%f0, %q2"
+ {
+   if (reg_overlap_mentioned_p (operands[0], operands[1]))
+     return "vmovn.i<V_sz_elem>\t%e0, %q1\;vmovn.i<V_sz_elem>\t%f0, %q2";
+   else
+     return "vmovn.i<V_sz_elem>\t%f0, %q2\;vmovn.i<V_sz_elem>\t%e0, %q1";
+ }
  [(set_attr "neon_type" "neon_shift_1")
   (set_attr "length" "8")]
 )
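
For anyone who wants to convince themselves of the ordering argument, here is a toy C model. It is not GCC code and only a sketch: the names qreg, vmovn_i32, pack_trunc and packs_correctly are hypothetical, pointer equality stands in for reg_overlap_mentioned_p, and only the V4SI to V8HI case is modelled. It assumes the destination aliases at most one of the two sources; the case where it aliases both is not covered.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit Q register as raw bytes.  */
typedef struct { uint8_t bytes[16]; } qreg;

/* Model of "vmovn.i32 Dd, Qm": read all four 32-bit lanes of SRC,
   then write their truncated 16-bit values to one D half of DST.  */
static void vmovn_i32(qreg *dst, int high_half, const qreg *src)
{
    uint32_t in[4];
    uint16_t out[4];
    memcpy(in, src->bytes, sizeof in);      /* the read happens first */
    for (int i = 0; i < 4; i++)
        out[i] = (uint16_t) in[i];
    memcpy(dst->bytes + (high_half ? 8 : 0), out, sizeof out);
}

/* Emit the two narrowing moves.  With USE_OVERLAP_CHECK set, the order
   follows the new output template; otherwise it is the old fixed order
   (low half first), which needed the early-clobber to be safe.  */
static void pack_trunc(qreg *op0, const qreg *op1, const qreg *op2,
                       int use_overlap_check)
{
    if (!use_overlap_check || op0 == op1) {
        vmovn_i32(op0, 0, op1);             /* %e0 <- %q1 */
        vmovn_i32(op0, 1, op2);             /* %f0 <- %q2 */
    } else {
        vmovn_i32(op0, 1, op2);             /* %f0 <- %q2 */
        vmovn_i32(op0, 0, op1);             /* %e0 <- %q1 */
    }
}

/* Pack {1,2,3,4} and {5,6,7,8}; return 1 if the result is lanes 1..8.
   ALIAS: 0 = separate destination, 1 = destination is operand 1,
          2 = destination is operand 2.  */
int packs_correctly(int alias, int use_overlap_check)
{
    const uint32_t lo[4] = {1, 2, 3, 4}, hi[4] = {5, 6, 7, 8};
    uint16_t got[8];
    qreg a, b, d;

    memcpy(a.bytes, lo, sizeof lo);
    memcpy(b.bytes, hi, sizeof hi);
    memset(d.bytes, 0, sizeof d.bytes);

    qreg *dst = alias == 1 ? &a : alias == 2 ? &b : &d;
    pack_trunc(dst, &a, &b, use_overlap_check);

    memcpy(got, dst->bytes, sizeof got);
    for (int i = 0; i < 8; i++)
        if (got[i] != i + 1)
            return 0;
    return 1;
}
```

In this model, packs_correctly returns 1 for all three alias settings when the overlap check is used. With the fixed order and the destination aliased to operand 2, the first move overwrites the low half of operand 2 before the second move reads it, and the function returns 0.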