On Mon, Oct 15, 2018 at 11:45 AM Jakub Jelinek <[email protected]> wrote:
>
> On Mon, Oct 15, 2018 at 11:30:56AM +0200, Richard Biener wrote:
> > But isn't _actual_ collapsing an implementation detail?
>
> No, it is required by the standard and in many cases it is very much
> observable.
> #pragma omp parallel for schedule(nonmonotonic: static, 23) collapse (2)
> for (int i = 0; i < 64; i++)
>   for (int j = 0; j < 16; j++)
>     a[i][j] = omp_get_thread_num ();
> The standard says that from the logical iteration space 64 x 16, the
> first 23 iterations go to the first thread (i.e. i=0, j=0..15 and i=1,
> j=0..6), then the next 23 iterations go to the second thread, etc.
> In other constructs, e.g. the new loop construct, it is a request to
> distribute, parallelize and vectorize as much as possible, with an optional
> guarantee of no cross-iteration dependencies at all, but even in that case
> using the source loops might not always be a win, e.g. the loop nest could be
> 5 loops deep and the iteration space might be diagonal or otherwise not
> exactly rectangular.
But then you could do
  for (int i = si1; i < n1; i++)
    for (int j = sj1; j < m1; j++)
      {
        a[i][j] = omp_get_thread_num ();
      }
  if (m_tail1)
    for (int j = 0; j < m_tail1; j++)
      ...
with appropriate start/end for the i/j loop and the "epilogue" loop?
> > That is, can we delay the actual collapsing until after vectorization
> > for example?
>
> No. We can come up with some way to propagate some of the original info to
> the vectorizer if it helps (or teach the vectorizer to recognize whatever we
> produce), but the mandatory transformation needs to be done
> immediately, before optimizations make it impossible.
The issue is that with refs like
  a[i % m] = a[(i + 1) % m];
you do not know whether you have a backward or a forward dependence,
so I do not see how you could perform loop vectorization. That implies
that one option might be to have the OMP lowering unroll & interleave
loops when asked for SIMD so that the SLP vectorizer could pick things
up?
But then how is safelen() defined in the context of collapse()?
Richard.
> Jakub