On Wed, Feb 28, 2018 at 3:20 PM, Richard Sandiford
<richard.sandif...@linaro.org> wrote:
> GCC 6 and 7 would vectorise:
>
> void
> f (unsigned long incx, unsigned long incy,
>    float *restrict dx, float *restrict dy)
> {
>   unsigned long ix = 0, iy = 0;
>   for (unsigned long i = 0; i < 512; ++i)
>     {
>       dy[iy] += dx[ix];
>       ix += incx;
>       iy += incy;
>     }
> }
>
> without first proving that incy is nonzero.  This is a regression from
> GCC 5.  It was fixed on trunk in r223486, which versioned the loop based
> on whether incy is zero, but that's obviously too invasive to backport.
> This patch instead bails out for non-constant steps in the place that
> trunk would try a check for zeroness.
>
> I did wonder about trying to use range information to prove nonzeroness
> for SSA_NAMEs, but that would be entirely new code and didn't seem
> suitable for a release branch.
>
> Tested on aarch64-linux-gnu.  OK for GCC 7 and 6?  I'll add the testcase
> to trunk too.

Given dist == 0 isn't it enough to test either DR_STEP (dra) or DR_STEP (drb)?
That seems what trunk is doing (just look at dr_zero_step_indicator of dra).
Even when not using range-info I think that using !
tree_expr_nonzero_p (DR_STEP (dra))
is more to the point of the issue we're fixing -- that also would catch
integer_zerop (DR_STEP (dra)) which trunk handles by failing as well but your
patch wouldn't so it looks like a more "complete" fix.

Last but not least trunk and your patch guards all this by
!loop->force_vectorize.
But force_vectorize doesn't give any such guarantee that step isn't
zero so I wonder
why we deliberately choose to possibly miscompile stuff here?

Thus I'd like to see a simpler

+         if (! tree_expr_nonzero_p (DR_STEP (dra)))
+           {
+             if (dump_enabled_p ())
+               dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                                "step could be zero.\n");
+             return true;
+           }

if you think that works out and also the force_vectorize guard removed from the
trunk version.

OK with that change.

Thanks,
Richard.

> Richard
>
>
> 2018-02-28  Richard Sandiford  <richard.sandif...@linaro.org>
>
> gcc/
>         PR tree-optimization/84485
>         * tree-vect-data-refs.c (vect_analyze_data_ref_dependence): Return
>         true for zero dependence distances if either step is variable, and if
>         there is no metadata that guarantees correctness.
>
> gcc/testsuite/
>         PR tree-optimization/84485
>         * gcc.dg/vect/pr84485.c: New test.
>
> Index: gcc/tree-vect-data-refs.c
> ===================================================================
> --- gcc/tree-vect-data-refs.c   2017-07-27 18:08:43.779978373 +0100
> +++ gcc/tree-vect-data-refs.c   2018-02-28 14:16:36.621113244 +0000
> @@ -394,6 +394,16 @@ vect_analyze_data_ref_dependence (struct
>                 }
>             }
>
> +         if (!loop->force_vectorize
> +             && (TREE_CODE (DR_STEP (dra)) != INTEGER_CST
> +                 || TREE_CODE (DR_STEP (drb)) != INTEGER_CST))
> +           {
> +             if (dump_enabled_p ())
> +               dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> +                                "step could be zero.\n");
> +             return true;
> +           }
> +
>           continue;
>         }
>
> Index: gcc/testsuite/gcc.dg/vect/pr84485.c
> ===================================================================
> --- /dev/null   2018-02-26 10:26:41.624847060 +0000
> +++ gcc/testsuite/gcc.dg/vect/pr84485.c 2018-02-28 14:16:36.620112862 +0000
> @@ -0,0 +1,34 @@
> +/* { dg-do run } */
> +
> +#include "tree-vect.h"
> +
> +#define N 256
> +
> +void __attribute__ ((noinline, noclone))
> +f (unsigned long incx, unsigned long incy,
> +   float *restrict dx, float *restrict dy)
> +{
> +  unsigned long ix = 0, iy = 0;
> +  for (unsigned long i = 0; i < N; ++i)
> +    {
> +      dy[iy] += dx[ix];
> +      ix += incx;
> +      iy += incy;
> +    }
> +}
> +
> +float a = 0.0;
> +float b[N];
> +
> +int
> +main (void)
> +{
> +  check_vect ();
> +
> +  for (int i = 0; i < N; ++i)
> +    b[i] = i;
> +  f (1, 0, b, &a);
> +  if (a != N * (N - 1) / 2)
> +    __builtin_abort ();
> +  return 0;
> +}

Reply via email to