https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92819
--- Comment #4 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 5 Dec 2019, rsandifo at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92819
>
> --- Comment #2 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #1)
> > Richard - when we have
> >
> >   _7 = { _2, _2 };
> >   VEC_PERM <x_3(D), _7, { 0, 3 }>
> >
> > then we somehow run into
> >
> >   /* See if the permutation is performing a single element
> >      insert from a CONSTRUCTOR or constant and use a BIT_INSERT_EXPR
> >      in that case. But only if the vector mode is supported,
> >      otherwise this is invalid GIMPLE. */
> >   if (TYPE_MODE (type) != BLKmode
> >       && (TREE_CODE (cop0) == VECTOR_CST
> >           || TREE_CODE (cop0) == CONSTRUCTOR
> >           || TREE_CODE (cop1) == VECTOR_CST
> >           || TREE_CODE (cop1) == CONSTRUCTOR))
> >     {
> >       if (sel.series_p (1, 1, nelts + 1, 1))
> >         {
> >           /* After canonicalizing the first elt to come from the
> >              first vector we only can insert the first elt from
> >              the first vector. */
> >           at = 0;
> >           if ((ins = fold_read_from_vector (cop0, sel[0])))
> >             op0 = op1;
> >
> > but of course cop0 isn't something we can simplify. So - why can
> > we only insert the first elt from the first vector?
>
> This is because the code has already canonicalised the order
> of the operands so that the first element of the result comes
> from the first vector:
>
>   else if (known_ge (poly_uint64 (sel[0]), nelts))
>     {
>       std::swap (op0, op1);
>       sel.rotate_inputs (1);
>     }
>
> So if we're inserting into the second lane or later, the
> inserted element comes from the second vector.

OK, so indeed if we cannot perform the optimization with an
insert to lane zero then we can consider inserting to lane two.
So I conclude my check is good.

> Don't know if that answers the rest of the question too.

Guess it works for me. Will test a patch then.
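
For anyone following along, here is a rough standalone sketch of the reasoning above. It is not GCC code: `Selector`, `canonicalize`, `InsertCandidate` and `insert_candidates` are hypothetical names made up for illustration, and the selector is a plain fixed-length vector rather than a `vec_perm_indices`. It only models the two ideas discussed: swapping the inputs so that lane 0 reads from the first vector, and then listing which single-element inserts the permutation could be read as.

```cpp
#include <cassert>
#include <vector>

// Simplified, hypothetical model of a two-input permute selector.
// Indices 0..nelts-1 pick from the first vector, nelts..2*nelts-1
// from the second.  This is NOT GCC's vec_perm_indices API.
using Selector = std::vector<unsigned>;

// Models "std::swap (op0, op1); sel.rotate_inputs (1);": if lane 0
// reads from the second vector, swap the inputs and renumber the
// indices accordingly.  Returns true if the inputs were swapped.
bool canonicalize(Selector &sel) {
  assert(!sel.empty());
  const unsigned nelts = static_cast<unsigned>(sel.size());
  if (sel[0] < nelts)
    return false;
  for (unsigned &idx : sel)
    idx = (idx + nelts) % (2 * nelts);
  return true;
}

struct InsertCandidate {
  int from_input;    // input (0 or 1) the inserted element is read from
  unsigned src_elt;  // element index within that input
  unsigned at;       // result lane that receives the element
};

// Lists the ways a canonicalized selector can be read as
// "one input unchanged, plus a single inserted element".
std::vector<InsertCandidate> insert_candidates(const Selector &sel) {
  assert(!sel.empty());
  const unsigned nelts = static_cast<unsigned>(sel.size());
  assert(sel[0] < nelts);  // precondition: already canonicalized
  std::vector<InsertCandidate> out;

  // Reading A: lanes 1.. are the second vector unchanged, so lane 0
  // is an element of the first vector inserted into the second.
  bool rest_is_second = true;
  for (unsigned i = 1; i < nelts; ++i)
    if (sel[i] != nelts + i)
      rest_is_second = false;
  if (rest_is_second)
    out.push_back({0, sel[0], 0});

  // Reading B: every lane but one is the first vector unchanged, and
  // the odd lane reads from the second vector.
  unsigned mismatches = 0, at = 0;
  for (unsigned i = 0; i < nelts; ++i)
    if (sel[i] != i) {
      ++mismatches;
      at = i;
    }
  if (mismatches == 1 && sel[at] >= nelts)
    out.push_back({1, sel[at] - nelts, at});

  return out;
}
```

For the selector { 0, 3 } from the testcase this sketch reports both readings. The first needs element 0 of x_3(D), which cannot be folded. The second inserts element 1 of _7 (that is, _2) into x_3 at the second lane, which as far as I can tell is the fallback being considered here.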