or linearization?

Sebastian Pop Mon, 15 Oct 2018 13:23:31 -0700

On Fri, Oct 12, 2018 at 2:14 PM Marc Glisse <[email protected]> wrote:


> On Fri, 12 Oct 2018, Thomas Schwinge wrote:
>
> > Hmm, and without any OpenACC/OpenMP etc., actually the same problem is
> > also present when running the following code through the vectorizer:
> >
> >    for (int tmp = 0; tmp < N_J * N_I; ++tmp)
> >      {
> >        int j = tmp / N_I;
> >        int i = tmp % N_I;
> >        a[j][i] = 0;
> >      }
> >
> > ... whereas the following variant (obviously) does vectorize:
> >
> >    int a[N_J * N_I];
> >
> >    for (int tmp = 0; tmp < N_J * N_I; ++tmp)
> >      a[tmp] = 0;
>
> I had a quick look at the difference, and a[j][i] remains in this form
> throughout optimization. If I write instead *((*(a+j))+i) = 0; I get
>
>    j_10 = tmp_17 / 1025;
>    i_11 = tmp_17 % 1025;
>    _1 = (long unsigned int) j_10;
>    _2 = _1 * 1025;
>    _3 = (sizetype) i_11;
>    _4 = _2 + _3;
>
> or for a power of 2
>
>    j_10 = tmp_17 >> 10;
>    i_11 = tmp_17 & 1023;
>    _1 = (long unsigned int) j_10;
>    _2 = _1 * 1024;
>    _3 = (sizetype) i_11;
>    _4 = _2 + _3;
>
> and in both cases we fail to notice that _4 = (sizetype) tmp_17; (at least
> I think that's true).
>
>
If this folding is correct, the dependence analysis would not have
to handle array accesses with div and mod, and it would be able
to classify the loop as parallel, which would enable vectorization.
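
As a quick sanity check of the identity Marc describes, here is a minimal
sketch in C. The 1025 comes from the dump above; the value of N_J and the
helper names linear_offset and check_folding are made up for illustration.
It only covers the non-negative range that the loop actually iterates over,
so it is evidence for the folding, not a proof.

```c
#include <assert.h>
#include <stddef.h>

#define N_J 7      /* hypothetical row count, chosen only for this check */
#define N_I 1025   /* row length, matching the constant in the dump */

/* Offset as computed in the dump: (size_t) j * N_I + (size_t) i. */
static size_t linear_offset(int tmp)
{
    int j = tmp / N_I;   /* TRUNC_DIV_EXPR */
    int i = tmp % N_I;   /* TRUNC_MOD_EXPR */
    return (size_t) j * N_I + (size_t) i;
}

/* Asserts that the offset equals (size_t) tmp for every loop iteration. */
static void check_folding(void)
{
    for (int tmp = 0; tmp < N_J * N_I; ++tmp)
        assert(linear_offset(tmp) == (size_t) tmp);
}
```

If the assertion holds over the whole iteration space, the access is just
a[0][tmp] in disguise, i.e. a unit-stride store with no div or mod left for
the dependence analysis to look at.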


> So there are missing match.pd transformations in addition to whatever
> scev/ivdep/other work is needed.
>
> --
> Marc Glisse
>

Re: Can support TRUNC_DIV_EXPR, TRUNC_MOD_EXPR in GCC vectorization/scalar evolution -- and/or linearization?
