https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109153

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
On the GIMPLE side I think we should canonicalize here; the point at
which inserts into a splatted vector become more profitable depends.

  _4 = VEC_PERM_EXPR <a_2(D), b_3(D), { 0, 8, 1, 9, 2, 10, 3, 11 }>;
  _5 = VEC_PERM_EXPR <a_2(D), b_3(D), { 4, 12, 5, 13, 6, 14, 7, 15 }>;
  _6 = {_4, _5};

For this we have simplify_vector_constructor in tree-ssa-forwprop.cc.
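
To make the GIMPLE above concrete, here is a minimal scalar sketch in C of
what the two VEC_PERM_EXPRs plus the constructor compute. The helper names
(vec_perm8, interleave_via_two_perms) are made up for illustration and are
not GCC internals; the only assumption about VEC_PERM_EXPR is the usual
one, that selector indices address the concatenation of the two inputs.

```c
#include <stddef.h>

enum { NELTS = 8 };

/* Scalar model of VEC_PERM_EXPR <v0, v1, sel> for 8-element vectors:
   selector index idx picks v0[idx] if idx < NELTS, else v1[idx - NELTS].
   Indices are reduced modulo 2 * NELTS.  */
static void vec_perm8(const int *v0, const int *v1,
                      const unsigned char *sel, int *out)
{
    for (size_t i = 0; i < NELTS; i++) {
        unsigned idx = sel[i] % (2 * NELTS);
        out[i] = idx < NELTS ? v0[idx] : v1[idx - NELTS];
    }
}

/* Hypothetical helper mirroring _4, _5 and _6 = {_4, _5}:
   out16 receives the low-half permute followed by the high-half one.  */
static void interleave_via_two_perms(const int *a, const int *b, int *out16)
{
    static const unsigned char lo[NELTS] = { 0, 8, 1, 9, 2, 10, 3, 11 };
    static const unsigned char hi[NELTS] = { 4, 12, 5, 13, 6, 14, 7, 15 };

    vec_perm8(a, b, lo, out16);          /* _4 */
    vec_perm8(a, b, hi, out16 + NELTS);  /* _5 */
}
```

In this model the 16-element result is simply a[0], b[0], a[1], b[1], ...,
a[7], b[7], i.e. a full interleave of a and b.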

For the other BIT_INSERT_EXPR case I'd go to match.pd, but adding a function
to forwprop is also possible.

If we want to expand { 4, 4, _1, 4, 4, ... } with splat + insert we should
IMHO do that at RTL expansion time where we already try splat (I think).
Not sure how to apply costing there though.  There's also the possibility
to expand { a, a, b, b, a, b, a, ... } with two splat + blend.  For
vec_init RTL expansion the target has full control, so it can decide for
itself (if we do not want to do anything in generic code).
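
As a rough illustration of the two expansion strategies mentioned above,
the following scalar C sketch models "splat + insert" and "two splat +
blend". All function names and the bitmask encoding of the blend are
assumptions made for this example; it says nothing about how the RTL
expander or a target's vec_init pattern would implement or cost them.

```c
#include <stddef.h>

enum { MAX_LANES = 32 };  /* assumption: blend mask fits in an unsigned */

/* Splat: every lane gets the same value.  */
static void splat(int *v, size_t n, int x)
{
    for (size_t i = 0; i < n; i++)
        v[i] = x;
}

/* Strategy 1: { c, c, x, c, c, ... } as one splat plus one lane insert.  */
static void build_splat_insert(int *out, size_t n, int common,
                               size_t lane, int x)
{
    splat(out, n, common);
    if (lane < n)
        out[lane] = x;  /* the single "insert" */
}

/* Strategy 2: { a, a, b, b, a, b, a, ... } as two splats plus a blend.
   Bit i of mask set means lane i comes from the splat of b.  */
static void build_two_splat_blend(int *out, size_t n, int a, int b,
                                  unsigned mask)
{
    int va[MAX_LANES], vb[MAX_LANES];

    if (n > MAX_LANES)
        return;  /* outside what this sketch models */
    splat(va, n, a);
    splat(vb, n, b);
    for (size_t i = 0; i < n; i++)
        out[i] = ((mask >> i) & 1u) ? vb[i] : va[i];
}
```

With n = 8, build_splat_insert (out, 8, 4, 2, x) gives the
{ 4, 4, x, 4, 4, 4, 4, 4 } shape, and build_two_splat_blend with mask 0x2c
(lanes 2, 3 and 5 taken from b) gives { a, a, b, b, a, b, a, a }.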
