On Tue, 19 Mar 2013, Richard Henderson wrote:
On 03/19/2013 08:47 AM, Marc Glisse wrote:
(define_insn_and_split "avx_<castmode><avxsizesuffix>_<castmode>"
[(set (match_operand:AVX256MODE2P 0 "nonimmediate_operand" "=x,m")
- (unspec:AVX256MODE2P
- [(match_operand:<ssehalfvecmode> 1 "nonimmediate_operand" "xm,x")]
- UNSPEC_CAST))]
+ (subreg:AVX256MODE2P
+ (match_operand:<ssehalfvecmode> 1 "nonimmediate_operand" "xm,x") 0))]
"TARGET_AVX"
"#"
"&& reload_completed"
[(const_int 0)]
I'm not fond of this, primarily because I believe the pattern should
not exist at all.
Sure, removing it would be even better.
One of the following is true:
(1) reload needs working around (thus all the reload_completed nonsense)
or
(2) the entire pattern is useless and would be subsumed by mov<mode>
or
(3) the entire pattern is useless and is *already* subsumed by
mov<mode>, since mov is earlier in the md file, making this
pattern dead code.
We need something to expand _mm256_castpd128_pd256 to. I tried making it a
define_expand (with the subreg pattern, and keeping the {} part intact),
but that gives check_rtl errors in lra. I then tried to remove the REG_P
condition and use simplify_gen_subreg or gen_lowpart, but the first one
gives unrecognizable insn at -O0 (same as removing the {} part completely)
(it seems happier at -O1), while the second ICEs (gen_lowpart_common
returns 0) for any -Ox except -O0. As must be obvious from this paragraph,
I just tried a few random bad ideas... and when none worked I posted the
minimal patch that worked.
Do you at least agree that vector-vector subregs make sense, or is that
part wrong as well?
--
Marc Glisse