Hi,

I just posted on the wiki (http://gcc.gnu.org/wiki/SwingModuloScheduling) a list of items to improve the GCC modulo scheduler (SMS). We've been looking into this on and off over the past year, while trying to tune it for ppc970 and then for the Cell. With relatively small tweaks, SMS is starting to show a rather nice impact on the Cell SPU. For example, on a simple summation program from the testsuite - vect-widen-mult-sum.c:

    int
    main1 (short *in, int off, short scale, int n)
    {
      int i;
      int sum = 0;

      for (i = 0; i < n; i++)
        {
          sum += ((int) in[i] * (int) in[i+off]) >> scale;
        }

      return sum;
    }

Compiling the above for the Cell SPU (with a few local patches we have, to be submitted to mainline - see wiki for details), SMS brings over 40% improvement (when unrolling is not enabled; SMS doesn't always improve more than unrolling does, and at present, SMS does not work on unrolled loops - one of the items on the wiki list...):

    -O3                          runtime: 880
    -O3 sms                      runtime: 482
    -O3 unroll                   runtime: 312
    -O3 -ftree-vectorize         runtime: 150
    -O3 -ftree-vectorize sms     runtime: 86
    -O3 -ftree-vectorize unroll  runtime: 96

where:
    unroll = -funroll-loops -fvariable-expansion-in-unroller
    sms    = -fmodulo-sched

The list does not include all possible improvements to SMS - people are welcome to edit the page and add additional items (probably the Itanium people have a few ideas in the pipe? See http://gcc.gnu.org/ml/gcc/2006-11/msg00361.html:

> We also plan to fix swing modulo scheduling to make it work on ia64
> and improve it by propagating data dependency information to RTL. We
> plan to discuss this project on the GCC mailing list in a few weeks.

)

Vladimir & Dorit