Hi,

In little endian mode, we managed to convert a load of the V4SI vector
{3, 3, 3, 7} into a vspltisw of 3, apparently taking offense at the
number 7.  It turns out that in little endian mode we only looked at the
first N-1 elements of an N-element vector, verifying the zeroth element
twice and never checking the last one.  Adjusting the loop boundaries
fixes the problem.
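
To illustrate the off-by-one, here is a stripped-down sketch, not the
actual rs6000.c code.  It assumes step == 1 (every element must equal
the splat value) and that the reference value comes from element 0 in
little endian mode; the function names are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-element check with step == 1:
   every element in [start, end) must equal the reference value VAL.  */
bool
all_equal_in_range (const int *v, unsigned start, unsigned end, int val)
{
  for (unsigned i = start; i < end; ++i)
    if (v[i] != val)
      return false;
  return true;
}

/* Old little endian bounds: elements 0 .. nunits-2.  Element 0 is
   compared against itself and element nunits-1 is never read.  */
bool
old_le_check (const int *v, unsigned nunits)
{
  return all_equal_in_range (v, 0, nunits - 1, v[0]);
}

/* Patched little endian bounds: elements 1 .. nunits-1.  Every element
   other than the reference element is compared exactly once.  */
bool
new_le_check (const int *v, unsigned nunits)
{
  return all_equal_in_range (v, 1, nunits, v[0]);
}
```

With v = {3, 3, 3, 7} and nunits = 4, old_le_check accepts the vector
because it never reads v[3], while new_le_check rejects it.  Both accept
{3, 3, 3, 3}.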

Currently bootstrapping for powerpc64{,le}-unknown-linux-gnu.  Ok to
commit to trunk if no regressions?

Thanks,
Bill


2013-10-18  Bill Schmidt  <wschm...@linux.vnet.ibm.com>

        * config/rs6000/rs6000.c (vspltis_constant): Make sure we check
        all elements for both endian flavors.


Index: gcc/config/rs6000/rs6000.c
===================================================================
@@ -4932,6 +4932,8 @@ vspltis_constant (rtx op, unsigned step, unsigned
   unsigned nunits;
   unsigned bitsize;
   unsigned mask;
+  unsigned start;
+  unsigned end;
 
   HOST_WIDE_INT val;
   HOST_WIDE_INT splat_val;
@@ -4981,7 +4983,10 @@ vspltis_constant (rtx op, unsigned step, unsigned
 
   /* Check if VAL is present in every STEP-th element, and the
      other elements are filled with its most significant bit.  */
-  for (i = 0; i < nunits - 1; ++i)
+  start = BYTES_BIG_ENDIAN ? 0 : 1;
+  end = BYTES_BIG_ENDIAN ? nunits - 1 : nunits;
+
+  for (i = start; i < end; ++i)
     {
       HOST_WIDE_INT desired_val;
       if (((BYTES_BIG_ENDIAN ? i + 1 : i) & (step - 1)) == 0)

