On Tue, Jul 13, 2021 at 09:30:43PM +0200, Jakub Jelinek via Gcc-patches wrote: > The following gcc.dg/pr101384.c testcase is miscompiled on > powerpc64le-linux. > easy_altivec_constant has code to try construct vector constants with > different element sizes, perhaps different from CONST_VECTOR's mode. But as > written, that works fine for vspltis[bhw] cases, but not for the vspltisw > x,-1; vsl[bhw] x,x,x case, because that creates always a V16QImode, V8HImode > or V4SImode constant containing broadcasted constant with just the MSB set. > The vspltis_constant function etc. expects the vspltis[bhw] instructions > where the small [-16..15] or even [-32..30] constant is sign-extended to the > remaining step bytes, but that is not the case for the 0x80...00 constants, > with step > 1 we can't handle e.g. > { 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff, > 0x80, 0xff, 0xff, 0xff } > vectors but do want to handle e.g. > { 0, 0, 0, 0x80, 0, 0, 0, 0x80, 0, 0, 0, 0x80, 0, 0, 0, 0x80 } > and similarly with copies > 1 we do want to handle e.g. > { 0x80808080, 0x80808080, 0x80808080, 0x80808080 }. > > Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux (the latter > regtested with -m32/-m64), ok for trunk? > > Perhaps for backports it would be best to limit the EASY_VECTOR_MSB case > matching to step == 1 && copies == 1, because that is the only case the > splitter handled correctly, but as can be seen in the gcc.target tests, the > patch tries to handle it for all the cases. Do you want that other patch > or prefer this patch for the backports too? > > 2021-07-13 Jakub Jelinek <ja...@redhat.com> > > PR target/101384 > * config/rs6000/rs6000-protos.h (easy_altivec_constant): Change return > type from bool to int. > * config/rs6000/rs6000.c (vspltis_constant): Fix up handling the > EASY_VECTOR_MSB case if either step or copies is not 1. > (vspltis_shifted): Fix comment typo. > (easy_altivec_constant): Change return type from bool to int, instead > of returning true return byte size of the element mode that should be > used to synthetize the constant. > * config/rs6000/predicates.md (easy_vector_constant_msb): Require > that vspltis_shifted is 0, handle the case where easy_altivec_constant > assumes using different vector mode from CONST_VECTOR's mode. > * config/rs6000/altivec.md (easy_vector_constant_msb splitter): Use > easy_altivec_constant to determine mode in which -1 >> -1 should be > performed, use rs6000_expand_vector_init instead of gen_vec_initv4sisi. > > * gcc.dg/pr101384.c: New test. > * gcc.target/powerpc/pr101384-1.c: New test. > * gcc.target/powerpc/pr101384-2.c: New test.
I'd like to ping this patch. For gcc 11, I've bootstrapped/regtested on powerpc64le-linux and powerpc64-linux (the latter regtested -m32/-m64) also a simpler version below, which restricts it to the case that the code handles properly. 2021-07-20 Jakub Jelinek <ja...@redhat.com> PR target/101384 * config/rs6000/rs6000.c (vspltis_constant): Accept EASY_VECTOR_MSB only if step and copies are equal to 1. * gcc.dg/pr101384.c: New test. --- gcc/config/rs6000/rs6000.c.jj 2021-07-18 12:50:43.816219546 +0200 +++ gcc/config/rs6000/rs6000.c 2021-07-20 10:46:23.880632997 +0200 @@ -6144,7 +6144,7 @@ vspltis_constant (rtx op, unsigned step, /* Also check if are loading up the most significant bit which can be done by loading up -1 and shifting the value left by -1. */ - else if (EASY_VECTOR_MSB (splat_val, inner)) + else if (EASY_VECTOR_MSB (splat_val, inner) && step == 1 && copies == 1) ; else --- gcc/testsuite/gcc.dg/pr101384.c.jj 2021-07-20 10:45:22.828486154 +0200 +++ gcc/testsuite/gcc.dg/pr101384.c 2021-07-20 10:45:22.828486154 +0200 @@ -0,0 +1,39 @@ +/* PR target/101384 */ +/* { dg-do run } */ +/* { dg-options "-O2 -Wno-psabi -w" } */ + +typedef unsigned char __attribute__((__vector_size__ (16))) U; +typedef unsigned short __attribute__((__vector_size__ (8 * sizeof (short)))) V; + +U u; +V v; + +__attribute__((noipa)) U +foo (void) +{ + U y = (U) { 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff, + 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff } + u; + return y; +} + +__attribute__((noipa)) V +bar (void) +{ + V y = (V) { 0x8000, 0xffff, 0x8000, 0xffff, + 0x8000, 0xffff, 0x8000, 0xffff } + v; + return y; +} + +int +main () +{ + U x = foo (); + for (unsigned i = 0; i < 16; i++) + if (x[i] != ((i & 3) ? 0xff : 0x80)) + __builtin_abort (); + V y = bar (); + for (unsigned i = 0; i < 8; i++) + if (y[i] != ((i & 1) ? 0xffff : 0x8000)) + __builtin_abort (); + return 0; +} Jakub