or linearization?

Marc Glisse Fri, 12 Oct 2018 12:14:41 -0700

On Fri, 12 Oct 2018, Thomas Schwinge wrote:

Hmm, and without any OpenACC/OpenMP etc., actually the same problem is
also present when running the following code through the vectorizer:


   for (int tmp = 0; tmp < N_J * N_I; ++tmp)
     {
       int j = tmp / N_I;
       int i = tmp % N_I;
       a[j][i] = 0;
     }

... whereas the following variant (obviously) does vectorize:

   int a[N_J * N_I];

   for (int tmp = 0; tmp < N_J * N_I; ++tmp)
     a[tmp] = 0;

I had a quick look at the difference, and a[j][i] remains in this form
throughout optimization. If I write instead *((*(a+j))+i) = 0; I get


  j_10 = tmp_17 / 1025;
  i_11 = tmp_17 % 1025;
  _1 = (long unsigned int) j_10;
  _2 = _1 * 1025;
  _3 = (sizetype) i_11;
  _4 = _2 + _3;

or for a power of 2

  j_10 = tmp_17 >> 10;
  i_11 = tmp_17 & 1023;
  _1 = (long unsigned int) j_10;
  _2 = _1 * 1024;
  _3 = (sizetype) i_11;
  _4 = _2 + _3;

and in both cases we fail to notice that _4 = (sizetype) tmp_17; (at least
I think that's true).
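As a sanity check of that identity, here is a minimal C sketch that mirrors the two GIMPLE sequences above and counts the values of tmp for which the recomputed offset differs from tmp itself. The function names (offset_divmod, offset_shiftmask, count_mismatches) are made up for illustration, size_t stands in for sizetype, and only the non-negative values of tmp that the loop actually produces are checked.

```c
#include <stddef.h>

/* Non-power-of-2 case: j = tmp / 1025, i = tmp % 1025, offset = j * 1025 + i. */
static size_t offset_divmod(int tmp)
{
    int j = tmp / 1025;
    int i = tmp % 1025;
    return (size_t)j * 1025 + (size_t)i;
}

/* Power-of-2 case: j = tmp >> 10, i = tmp & 1023, offset = j * 1024 + i. */
static size_t offset_shiftmask(int tmp)
{
    int j = tmp >> 10;
    int i = tmp & 1023;
    return (size_t)j * 1024 + (size_t)i;
}

/* Count the tmp in [0, limit) for which either offset differs from tmp. */
static int count_mismatches(int limit)
{
    int bad = 0;
    for (int tmp = 0; tmp < limit; ++tmp) {
        if (offset_divmod(tmp) != (size_t)tmp)
            ++bad;
        if (offset_shiftmask(tmp) != (size_t)tmp)
            ++bad;
    }
    return bad;
}
```

If the identity holds over the loop range, count_mismatches(1025 * 1025) returns 0, which is what a simplification folding _4 to (sizetype) tmp_17 would rely on. An exhaustive check over one range is not a proof for all inputs.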

So there are missing match.pd transformations in addition to whatever
scev/ivdep/other work is needed.


--
Marc Glisse

Re: Can support TRUNC_DIV_EXPR, TRUNC_MOD_EXPR in GCC vectorization/scalar evolution -- and/or linearization?
