------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2005-01-24 09:43 -------
Another one - matrix multiplication:

/* A [NxM], B [MxP] */
#define DOLOOP(N, M, P) \
void matmul ## N ## M ## P(double *res, const double *A, const double *B) \
{ \
  int i,j,k; \
  for (k=0; k<P; ++k) \
    for (i=0; i<N; ++i) { \
      double s = 0.0; \
      for (j=0; j<M; ++j) \
        s += A[i*M+j] * B[j*P+k]; \
      res[i*P+k] = s; \
    } \
}

DOLOOP(1, 1, 1)
DOLOOP(2, 1, 2)
DOLOOP(1, 2, 1)
DOLOOP(2, 2, 2)
DOLOOP(1, 3, 1)
DOLOOP(1, 1024, 1)

All cases up to 2x2 should be profitable to unroll completely. Be sure to
unroll loops that roll only once, as in the last case.

With Zdenek's patch, the only case not unrolled at -O2 is DOLOOP(2, 2, 2). Good.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19401
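
For reference, a hand-written sketch of what complete unrolling of the 2x2x2 case amounts to. The name matmul222_unrolled is hypothetical, and this is not output from any compiler or patch; it only spells out the four dot products in the same store order as the loop nest (k outer, i inner).

```c
/* Loop nest as DOLOOP(2, 2, 2) would expand it. */
void matmul222(double *res, const double *A, const double *B)
{
  int i, j, k;
  for (k = 0; k < 2; ++k)
    for (i = 0; i < 2; ++i) {
      double s = 0.0;
      for (j = 0; j < 2; ++j)
        s += A[i*2+j] * B[j*2+k];
      res[i*2+k] = s;
    }
}

/* Hand-unrolled equivalent (hypothetical name, written by hand).
   Same results as the loop nest, up to the sign of a zero result,
   because the initial "0.0 +" has been folded away. */
void matmul222_unrolled(double *res, const double *A, const double *B)
{
  res[0] = A[0]*B[0] + A[1]*B[2];  /* k=0, i=0 */
  res[2] = A[2]*B[0] + A[3]*B[2];  /* k=0, i=1 */
  res[1] = A[0]*B[1] + A[1]*B[3];  /* k=1, i=0 */
  res[3] = A[2]*B[1] + A[3]*B[3];  /* k=1, i=1 */
}
```

That is 8 multiplies, 4 adds and 4 stores with no loop control left, which is why the 2x2 case looks like a reasonable candidate for complete unrolling.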