On Tue, Jul 13, 2021 at 09:30:43PM +0200, Jakub Jelinek via Gcc-patches wrote:
> The following gcc.dg/pr101384.c testcase is miscompiled on
> powerpc64le-linux.
> easy_altivec_constant has code to try construct vector constants with
> different element sizes, perhaps different from CONST_VECTOR's mode.  But as
> written, that works fine for vspltis[bhw] cases, but not for the vspltisw
> x,-1; vsl[bhw] x,x,x case, because that creates always a V16QImode, V8HImode
> or V4SImode constant containing broadcasted constant with just the MSB set. 
> The vspltis_constant function etc. expects the vspltis[bhw] instructions
> where the small [-16..15] or even [-32..30] constant is sign-extended to the
> remaining step bytes, but that is not the case for the 0x80...00 constants,
> with step > 1 we can't handle e.g.
> { 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff, 
> 0x80, 0xff, 0xff, 0xff }
> vectors but do want to handle e.g.
> { 0, 0, 0, 0x80, 0, 0, 0, 0x80, 0, 0, 0, 0x80, 0, 0, 0, 0x80 }
> and similarly with copies > 1 we do want to handle e.g.
> { 0x80808080, 0x80808080, 0x80808080, 0x80808080 }.
> 
> Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux (the latter
> regtested with -m32/-m64), ok for trunk?
> 
> Perhaps for backports it would be best to limit the EASY_VECTOR_MSB case
> matching to step == 1 && copies == 1, because that is the only case the
> splitter handled correctly, but as can be seen in the gcc.target tests, the
> patch tries to handle it for all the cases.  Do you want that other patch
> or prefer this patch for the backports too?
> 
> 2021-07-13  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR target/101384
>       * config/rs6000/rs6000-protos.h (easy_altivec_constant): Change return
>       type from bool to int.
>       * config/rs6000/rs6000.c (vspltis_constant): Fix up handling the
>       EASY_VECTOR_MSB case if either step or copies is not 1.
>       (vspltis_shifted): Fix comment typo.
>       (easy_altivec_constant): Change return type from bool to int, instead
>       of returning true return byte size of the element mode that should be
>       used to synthetize the constant.
>       * config/rs6000/predicates.md (easy_vector_constant_msb): Require
>       that vspltis_shifted is 0, handle the case where easy_altivec_constant
>       assumes using different vector mode from CONST_VECTOR's mode.
>       * config/rs6000/altivec.md (easy_vector_constant_msb splitter): Use
>       easy_altivec_constant to determine mode in which -1 >> -1 should be
>       performed, use rs6000_expand_vector_init instead of gen_vec_initv4sisi.
> 
>       * gcc.dg/pr101384.c: New test.
>       * gcc.target/powerpc/pr101384-1.c: New test.
>       * gcc.target/powerpc/pr101384-2.c: New test.

I'd like to ping this patch.

For gcc 11, I've bootstrapped/regtested on powerpc64le-linux and
powerpc64-linux (the latter regtested -m32/-m64) also a simpler version
below, which restricts it to the case that the code handles properly.

2021-07-20  Jakub Jelinek  <ja...@redhat.com>

        PR target/101384
        * config/rs6000/rs6000.c (vspltis_constant): Accept EASY_VECTOR_MSB
        only if step and copies are equal to 1.

        * gcc.dg/pr101384.c: New test.

--- gcc/config/rs6000/rs6000.c.jj       2021-07-18 12:50:43.816219546 +0200
+++ gcc/config/rs6000/rs6000.c  2021-07-20 10:46:23.880632997 +0200
@@ -6144,7 +6144,7 @@ vspltis_constant (rtx op, unsigned step,
 
   /* Also check if are loading up the most significant bit which can be done by
      loading up -1 and shifting the value left by -1.  */
-  else if (EASY_VECTOR_MSB (splat_val, inner))
+  else if (EASY_VECTOR_MSB (splat_val, inner) && step == 1 && copies == 1)
     ;
 
   else
--- gcc/testsuite/gcc.dg/pr101384.c.jj  2021-07-20 10:45:22.828486154 +0200
+++ gcc/testsuite/gcc.dg/pr101384.c     2021-07-20 10:45:22.828486154 +0200
@@ -0,0 +1,39 @@
+/* PR target/101384 */
+/* { dg-do run } */
+/* { dg-options "-O2 -Wno-psabi -w" } */
+
+typedef unsigned char __attribute__((__vector_size__ (16))) U;
+typedef unsigned short __attribute__((__vector_size__ (8 * sizeof (short)))) V;
+
+U u;
+V v;
+
+__attribute__((noipa)) U
+foo (void)
+{
+  U y = (U) { 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff,
+              0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff } + u;
+  return y;
+}
+
+__attribute__((noipa)) V
+bar (void)
+{
+  V y = (V) { 0x8000, 0xffff, 0x8000, 0xffff,
+              0x8000, 0xffff, 0x8000, 0xffff } + v;
+  return y;
+}
+
+int
+main ()
+{
+  U x = foo ();
+  for (unsigned i = 0; i < 16; i++)
+    if (x[i] != ((i & 3) ? 0xff : 0x80))
+      __builtin_abort ();
+  V y = bar ();
+  for (unsigned i = 0; i < 8; i++)
+    if (y[i] != ((i & 1) ? 0xffff : 0x8000))
+      __builtin_abort ();
+  return 0;
+}


        Jakub

Reply via email to