eSi-RISC has vector permute functionality, but no unaligned loads. We see execution failures on gcc.dg/vect/slp-perm-12.c because loop versioning is used to make tptr aligned for the first loop iteration only. The original step is 11, which becomes 22 after vectorization; with a vector alignment of 8 bytes, the access in the second iteration is misaligned and causes an AlignmentError exception. The attached patch to tree-vect-data-refs.c suppresses attempts to align data accesses where the step alignment times the vectorization factor is insufficient to sustain the alignment during the loop.
Bootstrapped and regression tested on x86_64-pc-linux-gnu.
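
To make the arithmetic concrete, here is a minimal stand-alone sketch; it is not GCC code. The helper names step_alignment, alignment_sustained and offset_misalignment are made up for illustration. step_alignment is only a stand-in for DR_STEP_ALIGNMENT, under the assumption that this is the largest power of two dividing the step. The sketch also assumes that the vectorization factor and the vector alignment are powers of two and that the step is nonzero.

```c
#include <assert.h>

/* Illustration only: largest power of two that divides STEP.
   Assumed to mirror what DR_STEP_ALIGNMENT describes.  */
unsigned
step_alignment (unsigned step)
{
  assert (step != 0);
  return step & -step;
}

/* Nonzero if an access that is VEC_ALIGN-aligned in the first vector
   iteration stays aligned in all later ones.  This is the inverse of
   the condition under which the patch disables versioning.  */
int
alignment_sustained (unsigned vf, unsigned step, unsigned vec_align)
{
  return vf * step_alignment (step) >= vec_align;
}

/* Misalignment in bytes of the access in vector iteration ITER,
   assuming the access in iteration 0 is aligned.  */
unsigned
offset_misalignment (unsigned iter, unsigned vf, unsigned step,
                     unsigned vec_align)
{
  return (iter * vf * step) % vec_align;
}
```

With the numbers from slp-perm-12.c as described above (step 11, vectorization factor 2, vector alignment 8), step_alignment (11) is 1, so 2 * 1 is less than 8 and alignment_sustained returns 0. offset_misalignment (1, 2, 11, 8) gives 6, which is the misaligned second iteration that triggers the exception.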

I have also attached a matching testsuite patch so that slp-perm-12 does not expect SLP vectorization when no unaligned loads are available, although in terms of testing, I can only say that it works for us.

2018-12-15  Joern Rennecke  <joern.renne...@riscy-ip.com>

        * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Don't do
        versioning for data accesses with misaligned step.

Index: tree-vect-data-refs.c
===================================================================
--- tree-vect-data-refs.c       (revision 267262)
+++ tree-vect-data-refs.c       (working copy)
@@ -2160,6 +2160,20 @@ vect_enhance_data_refs_alignment (loop_v
                  break;
                }
 
+             /* Forcing alignment in the first iteration is no good if
+                we don't keep it across iterations.  For now, just disable
+                versioning in this case.
+                ?? We could actually unroll the loop to achieve the required
+                overall step alignment, and forcing the alignment could be
+                done by doing some iterations of the non-vectorized loop.  */
+             if (maybe_lt (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
+                           * DR_STEP_ALIGNMENT (dr),
+                           TYPE_ALIGN_UNIT (vectype)))
+               {
+                 do_versioning = false;
+                 break;
+               }
+
               /* The rightmost bits of an aligned address must be zeros.
                  Construct the mask needed for this test.  For example,
                  GET_MODE_SIZE for the vector mode V4SI is 16 bytes so the
2018-12-15  Joern Rennecke  <joern.renne...@riscy-ip.com>

        * testsuite/gcc.dg/vect/slp-perm-12.c (dg-final): Only expect SLP
        vectorization for ! vect_no_align.

Index: testsuite/gcc.dg/vect/slp-perm-12.c
===================================================================
--- testsuite/gcc.dg/vect/slp-perm-12.c (revision 5616)
+++ testsuite/gcc.dg/vect/slp-perm-12.c (revision 5617)
@@ -49,4 +49,4 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_perm } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { vect_perm  && {! vect_no_align } } } } } */
