I have been experimenting with the graphite optimizer, based on GCC trunk and cloog-isl. I started with the attached simple "C" program, which has this basic structure:

    #define N 20000
    int a[N][N], b[N], c[N];
    [...]
    for (i = 0; i < N; i++)
      {
        b[i] = i;
        c[i] = i + N;
      }
    for (i = 0; i < N; i++)
      for (j = 0; j < N; j++)
        a[j][i] = b[i] + c[j];

(The full test case is attached.)

I compiled it with: -O3 -floop-block

A couple of questions:

1) What option should I supply to confirm that the graphite optimizer ran, and to determine (i) whether it in fact performed any optimizations, and (ii) which optimizations it performed?

2) If -floop-block couldn't optimize this program, what is the likely reason?

3) Would you please offer pointers to example "C" programs that highlight graphite-cloog-isl optimizations?

Thanks,
- Gary

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 20000

int a[N][N], b[N], c[N];

/* Return the current time in seconds.  Note that CLOCK_MONOTONIC
   measures elapsed (wall-clock) time, not CPU time.  */
static double
cpu_time ()
{
  struct timespec ts;
  double t;

  if (clock_gettime (CLOCK_MONOTONIC, &ts))
    abort ();
  t = ts.tv_sec + (ts.tv_nsec * 1.0e-9);
  return t;
}

int
main (void)
{
  int i, j;
  double start, stop, elapsed;

  /* Initialize the input vectors.  */
  for (i = 0; i < N; i++)
    {
      b[i] = i;
      c[i] = i + N;
    }

  start = cpu_time ();

  /* Timed loop nest: the inner loop walks down a column of a.  */
  for (i = 0; i < N; i++)
    for (j = 0; j < N; j++)
      a[j][i] = b[i] + c[j];

  stop = cpu_time ();
  elapsed = stop - start;
  printf ("elapsed time = %0.2f secs.\n", elapsed);
  return 0;
}
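
For comparison, a hand-blocked version of the same loop nest might look like the sketch below. It illustrates loop blocking in general and is not what graphite generates. TILE, fill_naive, and fill_blocked are hypothetical names, the tile size of 64 is an arbitrary assumption, and the matrix is passed as a flat n*n array so that small sizes can be checked.

```c
#include <stddef.h>

#define TILE 64  /* assumed tile size, chosen arbitrarily for illustration */

/* Original loop order: the inner loop walks down a column of a,
   so consecutive stores are n ints apart.  */
static void
fill_naive (int *a, const int *b, const int *c, size_t n)
{
  size_t i, j;

  for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
      a[j * n + i] = b[i] + c[j];
}

/* Hand-blocked version: the same stores, visited one TILE x TILE
   tile at a time.  The bounds checks handle n not divisible by TILE.  */
static void
fill_blocked (int *a, const int *b, const int *c, size_t n)
{
  size_t ii, jj, i, j;

  for (ii = 0; ii < n; ii += TILE)
    for (jj = 0; jj < n; jj += TILE)
      for (i = ii; i < n && i < ii + TILE; i++)
        for (j = jj; j < n && j < jj + TILE; j++)
          a[j * n + i] = b[i] + c[j];
}
```

Both functions write the same values to the same elements; only the order of the stores differs, so timing one against the other gives a rough idea of what blocking alone can buy on this test case.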