Hi!

We have vec_initv4tiv2ti and vec_initv2titi patterns which call ix86_expand_vector_init and assume it works for those modes. For construction from two half-sized vectors, the code assumes the expansion will always succeed, but we only have insn patterns for SImode and DImode element types. QImode and HImode element types are already handled by performing the concatenation on same-sized vectors with SImode elements, and the following patch extends that to V*TImode vectors, using DImode elements.

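To make the mode arithmetic concrete, here is a rough standalone sketch of the element-count computation. It is not GCC code: `recast_concat` and `struct recast` are hypothetical names, and plain byte sizes stand in for machine modes (1 for QImode, 2 for HImode, 16 for TImode, 4 for SImode, 8 for DImode).

```c
#include <assert.h>

/* Hypothetical model, not GCC code: describes the vector shape used in
   place of a QImode/HImode/TImode-element vector when concatenating
   two half-sized operands.  */
struct recast
{
  unsigned elt_bytes;  /* size of the replacement element: 4 (SI) or 8 (DI) */
  unsigned full_elts;  /* element count of the full replacement vector */
  unsigned half_elts;  /* element count of each half-sized operand */
};

static struct recast
recast_concat (unsigned n_elts, unsigned inner_bytes)
{
  struct recast r;
  unsigned total_bytes = n_elts * inner_bytes;

  /* Only the element sizes handled by this path are modelled.  */
  assert (inner_bytes == 1 || inner_bytes == 2 || inner_bytes == 16);

  /* TImode elements are viewed as DImode, the others as SImode.  */
  r.elt_bytes = inner_bytes == 16 ? 8 : 4;

  /* The vector must split evenly into two halves of whole elements.  */
  assert (total_bytes % (2 * r.elt_bytes) == 0);

  r.full_elts = total_bytes / r.elt_bytes;
  r.half_elts = r.full_elts / 2;
  return r;
}
```

For V4TI built from two V2TI operands, `recast_concat (4, 16)` gives 8 DImode-sized elements with 4-element halves, which corresponds to a V8DI concatenation of two V4DI operands. For V32QI, `recast_concat (32, 1)` gives 8 SImode-sized elements with 4-element halves, matching the existing QImode handling.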
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-06-04  Jakub Jelinek  <ja...@redhat.com>

        PR target/100887
        * config/i386/i386-expand.c (ix86_expand_vector_init): Handle
        concatenation from half-sized modes with TImode elements.

        * gcc.target/i386/pr100887.c: New test.

--- gcc/config/i386/i386-expand.c.jj    2021-05-28 11:03:19.424885281 +0200
+++ gcc/config/i386/i386-expand.c       2021-06-03 12:30:44.263286549 +0200
@@ -14610,11 +14610,15 @@ ix86_expand_vector_init (bool mmx_ok, rt
       if (GET_MODE_NUNITS (GET_MODE (x)) * 2 == n_elts)
         {
           rtx ops[2] = { XVECEXP (vals, 0, 0), XVECEXP (vals, 0, 1) };
-          if (inner_mode == QImode || inner_mode == HImode)
+          if (inner_mode == QImode
+              || inner_mode == HImode
+              || inner_mode == TImode)
             {
               unsigned int n_bits = n_elts * GET_MODE_SIZE (inner_mode);
-              mode = mode_for_vector (SImode, n_bits / 4).require ();
-              inner_mode = mode_for_vector (SImode, n_bits / 8).require ();
+              scalar_mode elt_mode = inner_mode == TImode ? DImode : SImode;
+              n_bits /= GET_MODE_SIZE (elt_mode);
+              mode = mode_for_vector (elt_mode, n_bits).require ();
+              inner_mode = mode_for_vector (elt_mode, n_bits / 2).require ();
               ops[0] = gen_lowpart (inner_mode, ops[0]);
               ops[1] = gen_lowpart (inner_mode, ops[1]);
               subtarget = gen_reg_rtx (mode);
--- gcc/testsuite/gcc.target/i386/pr100887.c.jj 2021-06-03 12:44:09.653939987 +0200
+++ gcc/testsuite/gcc.target/i386/pr100887.c    2021-06-03 12:43:36.580404322 +0200
@@ -0,0 +1,13 @@
+/* PR target/100887 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-mavx512f" } */
+
+typedef unsigned __int128 U __attribute__((__vector_size__ (64)));
+typedef unsigned __int128 V __attribute__((__vector_size__ (32)));
+typedef unsigned __int128 W __attribute__((__vector_size__ (16)));
+
+W
+foo (U u, V v)
+{
+  return __builtin_shufflevector (u, v, 0);
+}

        Jakub