http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56766
Bug #: 56766
Summary: Fails to combine (vec_select (vec_concat ...)) to
(vec_merge ...)
Classification: Unclassified
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: [email protected]
ReportedBy: [email protected]
CC: [email protected]
Target: x86_64-*-*
With a patch to vectorize the pattern that should lead to the use of
(define_insn "sse3_addsubv2df3"
  [(set (match_operand:V2DF 0 "register_operand" "=x,x")
        (vec_merge:V2DF
          (plus:V2DF
            (match_operand:V2DF 1 "register_operand" "0,x")
            (match_operand:V2DF 2 "nonimmediate_operand" "xm,xm"))
          (minus:V2DF (match_dup 1) (match_dup 2))
          (const_int 2)))]
  "TARGET_SSE3"
this instruction fails to be generated because the GIMPLE
vect_var_.9_15 = vect_var_.5_22 + vect_var_.8_18;
vect_var_.10_14 = vect_var_.5_22 - vect_var_.8_18;
_2 = VEC_PERM_EXPR <vect_var_.9_15, vect_var_.10_14, { 0, 3 }>;
is expanded to
(insn 24 23 25 (set (reg:V2DF 80 [ vect_var_.9 ])
        (plus:V2DF (reg:V2DF 76 [ vect_var_.5 ])
            (reg:V2DF 75 [ vect_var_.8 ]))) t.c:7 -1
     (nil))
(insn 25 24 27 (set (reg:V2DF 81 [ vect_var_.10 ])
        (minus:V2DF (reg:V2DF 76 [ vect_var_.5 ])
            (reg:V2DF 75 [ vect_var_.8 ]))) t.c:7 -1
     (nil))
(insn 27 25 28 (set (reg:V2DF 82 [ D.1768 ])
        (vec_select:V2DF (vec_concat:V4DF (reg:V2DF 80 [ vect_var_.9 ])
                (reg:V2DF 81 [ vect_var_.10 ]))
            (parallel [
                    (const_int 0 [0])
                    (const_int 3 [0x3])
                ]))) t.c:7 -1
     (nil))
which does not match the pattern in the i386 backend.
The question is what the canonical form should be. vec_merge is certainly
redundant: it can always be replaced with (vec_select (vec_concat ...)).
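To illustrate the relationship between the two forms, here is a rough sketch
in plain C of when a (vec_select (vec_concat A B)) selector could be rewritten
as a vec_merge mask. It assumes the documented vec_merge semantics, where a set
bit i takes element i from the first operand and a clear bit takes it from the
second. perm_to_merge_mask is a hypothetical helper written for this
illustration only; it is not an existing GCC function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC.  SEL holds the NELT indices of a
   vec_select applied to (vec_concat A B), where A and B have NELT elements
   each.  If every lane i selects either A[i] (index i) or B[i] (index
   i + NELT), store the equivalent vec_merge mask in *MASK and return true.
   Assumed semantics: bit i set means lane i is taken from A.  */
static bool
perm_to_merge_mask (const unsigned *sel, size_t nelt, unsigned long *mask)
{
  unsigned long m = 0;

  if (nelt > 8 * sizeof m)
    return false;               /* mask would not fit */

  for (size_t i = 0; i < nelt; i++)
    {
      if (sel[i] == i)
        m |= 1UL << i;          /* lane i comes from A */
      else if (sel[i] != i + nelt)
        return false;           /* lane moves position: not a merge */
    }

  *mask = m;
  return true;
}
```

Under these assumptions the selector { 0, 3 } gives mask 1 and { 2, 1 } gives
mask 2, while a selector such as { 1, 2 } has no vec_merge equivalent. The
reverse direction always works, which is the redundancy noted above.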
Testcase w/o my vectorizer hack (compile with -O -msse3):
typedef double v2df __attribute__((vector_size(16)));
typedef long long v2di __attribute__((vector_size(16)));
v2df foo (v2df x, v2df y)
{
  v2df tem1 = x + y;
  v2df tem2 = x - y;
  return __builtin_shuffle (tem1, tem2, (v2di) { 0, 3 });
}
VEC_MERGE is not used very often ...