Hi,

As PR92464 shows, the recent vectorization cost adjustment for load
insns is responsible for this regression.  It changes the minimum
profitable iteration count from 19 to 12, and the test case happens
to hit the new threshold.  In actual runtime measurements, the
vectorized version performs on par with the non-vectorized version
(the previous behavior), so vectorizing at 12 iterations is actually
fine.  To keep the test case sensitive to high peeling cost, this
patch adjusts the loop bound from 16 to 14.
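
For illustration only, the arithmetic behind the change can be
sketched as below.  This is not the vectorizer's cost model.  It
assumes the loops in the test run N - OFF iterations (the loop bodies
are not part of this diff) and that a loop is vectorized when its
iteration count reaches the minimum profitable count.  The helper
names loop_iterations and is_vectorized are hypothetical.

```c
/* Hypothetical sketch, not GCC code.  */
#define OFF 4

/* Assumption: the loops of interest run N - OFF iterations.  */
static int
loop_iterations (int n)
{
  return n - OFF;
}

/* Assumption: vectorization happens when the iteration count is at
   least the minimum profitable iteration count.  */
static int
is_vectorized (int niters, int min_profitable_iters)
{
  return niters >= min_profitable_iters;
}
```

Under these assumptions, N = 16 gives 12 iterations, which was below
the old threshold of 19 but meets the new threshold of 12.  N = 14
gives 10 iterations, which is below 12 again, so the loop stays
unvectorized as the test expects.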

Verified on ppc64-redhat-linux (BE P7) and powerpc64le-linux-gnu
(LE P8). 


BR,
Kewen

-----

gcc/testsuite/ChangeLog

2019-11-13  Kewen Lin  <li...@gcc.gnu.org>

        PR target/92464
        * gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c: Adjust
        loop bound due to load cost adjustment.


diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
index 4a7da2e..1bb064e 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
@@ -4,7 +4,7 @@
 #include <stdarg.h>
 #include "../../tree-vect.h"

-#define N 16
+#define N 14
 #define OFF 4

 /* Check handling of accesses for which the "initial condition" -
