On Wed, Feb 28, 2018 at 3:20 PM, Richard Sandiford <richard.sandif...@linaro.org> wrote: > GCC 6 and 7 would vectorise: > > void > f (unsigned long incx, unsigned long incy, > float *restrict dx, float *restrict dy) > { > unsigned long ix = 0, iy = 0; > for (unsigned long i = 0; i < 512; ++i) > { > dy[iy] += dx[ix]; > ix += incx; > iy += incy; > } > } > > without first proving that incy is nonzero. This is a regression from > GCC 5. It was fixed on trunk in r223486, which versioned the loop based > on whether incy is zero, but that's obviously too invasive to backport. > This patch instead bails out for non-constant steps in the place that > trunk would try a check for zeroness. > > I did wonder about trying to use range information to prove nonzeroness > for SSA_NAMEs, but that would be entirely new code and didn't seem > suitable for a release branch. > > Tested on aarch64-linux-gnu. OK for GCC 7 and 6? I'll add the testcase > to trunk too.
Given dist == 0 isn't it enough to test either DR_STEP (dra) or DR_STEP (drb)? That seems what trunk is doing (just look at dr_zero_step_indicator of dra). Even when not using range-info I think that using ! tree_expr_nonzero_p (DR_STEP (dra)) is more to the point of the issue we're fixing -- that also would catch integer_zerop (DR_STEP (dra)) which trunk handles by failing as well but your patch wouldn't so it looks like a more "complete" fix. Last but not least trunk and your patch guards all this by !loop->force_vectorize. But force_vectorize doesn't give any such guarantee that step isn't zero so I wonder why we deliberately choose to possibly miscompile stuff here? Thus I'd like to see a simpler + if (! tree_expr_nonzero_p (DR_STEP (dra))) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, + "step could be zero.\n"); + return true; + } if you think that works out and also the force_vectorize guard removed from the trunk version. OK with that change. Thanks, Richard. > Richard > > > 2018-02-28 Richard Sandiford <richard.sandif...@linaro.org> > > gcc/ > PR tree-optimization/84485 > * tree-vect-data-refs.c (vect_analyze_data_ref_dependence): Return > true for zero dependence distances if either step is variable, and if > there is no metadata that guarantees correctness. > > gcc/testsuite/ > PR tree-optimization/84485 > * gcc.dg/vect/pr84485.c: New test. > > Index: gcc/tree-vect-data-refs.c > =================================================================== > --- gcc/tree-vect-data-refs.c 2017-07-27 18:08:43.779978373 +0100 > +++ gcc/tree-vect-data-refs.c 2018-02-28 14:16:36.621113244 +0000 > @@ -394,6 +394,16 @@ vect_analyze_data_ref_dependence (struct > } > } > > + if (!loop->force_vectorize > + && (TREE_CODE (DR_STEP (dra)) != INTEGER_CST > + || TREE_CODE (DR_STEP (drb)) != INTEGER_CST)) > + { > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > + "step could be zero.\n"); > + return true; > + } > + > continue; > } > > Index: gcc/testsuite/gcc.dg/vect/pr84485.c > =================================================================== > --- /dev/null 2018-02-26 10:26:41.624847060 +0000 > +++ gcc/testsuite/gcc.dg/vect/pr84485.c 2018-02-28 14:16:36.620112862 +0000 > @@ -0,0 +1,34 @@ > +/* { dg-do run } */ > + > +#include "tree-vect.h" > + > +#define N 256 > + > +void __attribute__ ((noinline, noclone)) > +f (unsigned long incx, unsigned long incy, > + float *restrict dx, float *restrict dy) > +{ > + unsigned long ix = 0, iy = 0; > + for (unsigned long i = 0; i < N; ++i) > + { > + dy[iy] += dx[ix]; > + ix += incx; > + iy += incy; > + } > +} > + > +float a = 0.0; > +float b[N]; > + > +int > +main (void) > +{ > + check_vect (); > + > + for (int i = 0; i < N; ++i) > + b[i] = i; > + f (1, 0, b, &a); > + if (a != N * (N - 1) / 2) > + __builtin_abort (); > + return 0; > +}