On Fri, Dec 4, 2009 at 09:41, Toon Moene <t...@moene.org> wrote: > I wonder if Graphite can do this one (or is planned to be able to): > > Another loop optimization that suggests itself > is the following, trying to eliminate unnecessary > loop constructs: > \begin{verbatim} > SUBROUTINE SUB(A, B, C, N, M) > REAL A(N, M), B(N, M), C(N, M) > DO J = 1, M > DO I = 1, N > C(I, J) = A(I, J) + B(I, J) > ENDDO > ENDDO > END > \end{verbatim} > If we could generate code that looks like what is > written below, we would have nothing but > {\em one} loop. > \begin{verbatim} > DO K = 1, M*N > C(K, 1) = A(K, 1) + B(K, 1) > ENDDO > \end{verbatim} > In this case, this transformation can be done > because the tuple $(i,j)$ forms an induction > variable $i+n*(j-1)$ in its own right > (replaced by $k$ in the {\em collapsed} loop).
For the moment Graphite wouldn't do this kind of transform. But I think that this could be done: CLooG should generate the following code if we ask it to collapse the two loops: DO K = 1, M*N I = K mod N J = K / N C(I, J) = A(I, J) + B(I, J) ENDDO And then one would have to cleanup the scalar arithmetic to have linearized array accesses, and that should already be done by GCC in the lowering of memory accesses. Now a question for Cedric: how would you ask CLooG to generate this collapsed loop? Thanks, Sebastian