GCC 6 and 7 would vectorise: void f (unsigned long incx, unsigned long incy, float *restrict dx, float *restrict dy) { unsigned long ix = 0, iy = 0; for (unsigned long i = 0; i < 512; ++i) { dy[iy] += dx[ix]; ix += incx; iy += incy; } }
without first proving that incy is nonzero. This is a regression from GCC 5. It was fixed on trunk in r223486, which versioned the loop based on whether incy is zero, but that's obviously too invasive to backport. This patch instead bails out for non-constant steps in the place that trunk would try a check for zeroness. I did wonder about trying to use range information to prove nonzeroness for SSA_NAMEs, but that would be entirely new code and didn't seem suitable for a release branch. Tested on aarch64-linux-gnu. OK for GCC 7 and 6? I'll add the testcase to trunk too. Richard 2018-02-28 Richard Sandiford <richard.sandif...@linaro.org> gcc/ PR tree-optimization/84485 * tree-vect-data-refs.c (vect_analyze_data_ref_dependence): Return true for zero dependence distances if either step is variable, and if there is no metadata that guarantees correctness. gcc/testsuite/ PR tree-optimization/84485 * gcc.dg/vect/pr84485.c: New test. Index: gcc/tree-vect-data-refs.c =================================================================== --- gcc/tree-vect-data-refs.c 2017-07-27 18:08:43.779978373 +0100 +++ gcc/tree-vect-data-refs.c 2018-02-28 14:16:36.621113244 +0000 @@ -394,6 +394,16 @@ vect_analyze_data_ref_dependence (struct } } + if (!loop->force_vectorize + && (TREE_CODE (DR_STEP (dra)) != INTEGER_CST + || TREE_CODE (DR_STEP (drb)) != INTEGER_CST)) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, + "step could be zero.\n"); + return true; + } + continue; } Index: gcc/testsuite/gcc.dg/vect/pr84485.c =================================================================== --- /dev/null 2018-02-26 10:26:41.624847060 +0000 +++ gcc/testsuite/gcc.dg/vect/pr84485.c 2018-02-28 14:16:36.620112862 +0000 @@ -0,0 +1,34 @@ +/* { dg-do run } */ + +#include "tree-vect.h" + +#define N 256 + +void __attribute__ ((noinline, noclone)) +f (unsigned long incx, unsigned long incy, + float *restrict dx, float *restrict dy) +{ + unsigned long ix = 0, iy = 0; + for (unsigned long i = 0; i < N; ++i) + { + dy[iy] += dx[ix]; + ix += incx; + iy += incy; + } +} + +float a = 0.0; +float b[N]; + +int +main (void) +{ + check_vect (); + + for (int i = 0; i < N; ++i) + b[i] = i; + f (1, 0, b, &a); + if (a != N * (N - 1) / 2) + __builtin_abort (); + return 0; +}