On Tue, 19 Mar 2013, Richard Henderson wrote:

On 03/19/2013 08:47 AM, Marc Glisse wrote:
 (define_insn_and_split "avx_<castmode><avxsizesuffix>_<castmode>"
   [(set (match_operand:AVX256MODE2P 0 "nonimmediate_operand" "=x,m")
-       (unspec:AVX256MODE2P
-         [(match_operand:<ssehalfvecmode> 1 "nonimmediate_operand" "xm,x")]
-         UNSPEC_CAST))]
+       (subreg:AVX256MODE2P
+         (match_operand:<ssehalfvecmode> 1 "nonimmediate_operand" "xm,x") 0))]
   "TARGET_AVX"
   "#"
   "&& reload_completed"
   [(const_int 0)]

I'm not fond of this, primarily because I believe the pattern should
not exist at all.

Sure, removing it would be even better.

One of the following is true:

 (1) reload needs working around (thus all the reload_completed nonsense)
or
 (2) the entire pattern is useless and would be subsumed by mov<mode>
or
 (3) the entire pattern is useless and is *already* subsumed by
     mov<mode>, since mov is earlier in the md file, making this
     pattern dead code.

We need something to expand _mm256_castpd128_pd256 to. I tried making it a define_expand (with the subreg pattern, and keeping the {} part intact), but that gives check_rtl errors in lra. I then tried to remove the REG_P condition and use simplify_gen_subreg or gen_lowpart, but the first one gives unrecognizable insn at -O0 (same as removing the {} part completely) (it seems happier at -O1), while the second ICEs (gen_lowpart_common returns 0) for any -Ox except -O0. As must be obvious from this paragraph, I just tried a few random bad ideas... and when none worked I posted the minimal patch that worked.

Do you at least agree that vector-vector subregs make sense, or is that part wrong as well?

--
Marc Glisse

Reply via email to