https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92819

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Richard - when we have

  _7 = { _2, _2 };
  VEC_PERM <x_3(D), _7, { 0, 3 }>

then we somehow run into

      /* See if the permutation is performing a single element
         insert from a CONSTRUCTOR or constant and use a BIT_INSERT_EXPR
         in that case.  But only if the vector mode is supported,
         otherwise this is invalid GIMPLE.  */
      if (TYPE_MODE (type) != BLKmode
          && (TREE_CODE (cop0) == VECTOR_CST
              || TREE_CODE (cop0) == CONSTRUCTOR
              || TREE_CODE (cop1) == VECTOR_CST
              || TREE_CODE (cop1) == CONSTRUCTOR))
        {
          if (sel.series_p (1, 1, nelts + 1, 1))
            {
              /* After canonicalizing the first elt to come from the
                 first vector we only can insert the first elt from
                 the first vector.  */
              at = 0;
              if ((ins = fold_read_from_vector (cop0, sel[0])))
                op0 = op1;

but of course cop0 isn't something we can simplify.  So - why can we
only insert the first elt from the first vector?  Is series_p falsely
triggering because both vectors are "series_p"?  That said, we can
insert both ways but only one way succeeds in the end.  The code also
looks like it would fail for V1m vectors since then the base element
number is out of range.  That is, does the above even make sense
for nelts <= 2?

This is

v2df qux (v2df x, double *p)
{
  return (v2df) { x[0], *p };
}

it works for

v2df qux (v2df x, double *p)
{
  return (v2df) { *p, x[1] };
}

when guarding the special case with

  TREE_CODE (cop0) == VECTOR_CST || TREE_CODE (cop0) == CONSTRUCTOR

the later code runs into

bool
vec_perm_indices::series_p (unsigned int out_base, unsigned int out_step,
                            element_type in_base, element_type in_step) const
{
  /* Check the base value.  */
  if (maybe_ne (clamp (m_encoding.elt (out_base)), clamp (in_base)))
    return false;

and thus doesn't handle insertion at the very last element?
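
For reference, the two testcases can be put into one standalone file. This is only a sketch: the v2df typedef is assumed to be the usual GNU vector extension type, and the names qux_last, qux_first and run_examples are hypothetical (the comment calls both variants qux). It checks the expected results only; whether the missed optimization shows up depends on compiler version and flags, which the sketch does not assume.

```cpp
#include <cassert>

// Assumption: v2df is a 16-byte vector of two doubles (GNU vector extension).
typedef double v2df __attribute__ ((vector_size (16)));

// Replaces the last element; this is the variant tied to selector { 0, 3 }.
v2df qux_last (v2df x, double *p)
{
  v2df r = { x[0], *p };
  return r;
}

// Replaces the first element; this is the variant reported to work.
v2df qux_first (v2df x, double *p)
{
  v2df r = { *p, x[1] };
  return r;
}

// Hypothetical helper: returns true if both variants build the expected vector.
bool run_examples ()
{
  v2df x = { 1.0, 2.0 };
  double d = 5.0;
  v2df a = qux_last (x, &d);
  v2df b = qux_first (x, &d);
  return a[0] == 1.0 && a[1] == 5.0 && b[0] == 5.0 && b[1] == 2.0;
}
```

Comparing the optimized GIMPLE or assembly of the two functions is one way to see whether the insert is recognized in both positions.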

I can "fix" that by doing

@@ -6047,9 +6049,11 @@ (define_operator_list COND_TERNARY
          for (at = 0; at < encoded_nelts; ++at)
            if (maybe_ne (sel[at], at))
              break;
-         if (at < encoded_nelts && sel.series_p (at + 1, 1, at + 1, 1))
+         if (at < encoded_nelts
+             && (known_eq (at + 1, nelts)
+                 || sel.series_p (at + 1, 1, at + 1, 1)))
            {

maybe the earlier series_p query needs to be adjusted similarly?  Or do
you think that

@@ -6032,7 +6032,9 @@ (define_operator_list COND_TERNARY
              || TREE_CODE (cop1) == VECTOR_CST
              || TREE_CODE (cop1) == CONSTRUCTOR))
        {
-         if (sel.series_p (1, 1, nelts + 1, 1))
+         if (sel.series_p (1, 1, nelts + 1, 1)
+             && (TREE_CODE (cop0) == VECTOR_CST
+                 || TREE_CODE (cop0) == CONSTRUCTOR))
            {
              /* After canonicalizing the first elt to come from the
                 first vector we only can insert the first elt from

is fine?
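
To illustrate the last-element case outside of GCC, here is a small standalone model. It is not vec_perm_indices: Selector, series_p_model and find_insert_position are hypothetical names, the selector is a plain fixed-length array, and the model assumes that a series query starting past the final element reports no match. The allow_last flag stands in for the extra known_eq (at + 1, nelts) test from the first hunk.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a fixed-length permutation selector.
// NOT GCC's vec_perm_indices; it only models the checks discussed above.
using Selector = std::vector<unsigned>;

// Model of series_p: true if sel[out_base], sel[out_base + out_step], ...
// equal in_base, in_base + in_step, ...
// Assumption: a query that starts past the last element reports "no match".
bool series_p_model (const Selector &sel, std::size_t out_base,
                     std::size_t out_step, unsigned in_base, unsigned in_step)
{
  if (out_step == 0 || out_base >= sel.size ())
    return false;
  unsigned expected = in_base;
  for (std::size_t i = out_base; i < sel.size (); i += out_step)
    {
      if (sel[i] != expected)
        return false;
      expected += in_step;
    }
  return true;
}

// Returns the index of the single element that differs from the identity
// selection, or -1 if the selector is not recognized as a single insert.
// allow_last models the additional "at + 1 == nelts" test.
int find_insert_position (const Selector &sel, bool allow_last)
{
  std::size_t nelts = sel.size ();
  std::size_t at = 0;
  while (at < nelts && sel[at] == at)
    ++at;
  if (at < nelts
      && ((allow_last && at + 1 == nelts)
          || series_p_model (sel, at + 1, 1,
                             static_cast<unsigned> (at + 1), 1)))
    return static_cast<int> (at);
  return -1;
}
```

In this model the selector { 0, 3 } is only recognized as an insert at index 1 when allow_last is set, while a selector such as { 2, 1 } (chosen here for illustration) is recognized at index 0 either way.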