------- Comment #108 from lucier at math dot purdue dot edu  2009-08-27 01:18 -------
direct.c contains a direct FFT; I've compiled the direct and inverse FFT and
ran them on arrays with 2^23 double-precision complex elements, using this
compiler:

heine:~/programs/gcc/objdirs/bench-mainline-on-fft> /pkgs/gcc-mainline/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../../mainline/configure --enable-checking=release --prefix=/pkgs/gcc-mainline --enable-languages=c,c++ -enable-stage1-languages=c,c++
Thread model: posix
gcc version 4.5.0 20090803 (experimental) [trunk revision 150373] (GCC)

The compile options were

/pkgs/gcc-mainline/bin/gcc -save-temps -c -Wno-unused -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -fno-common -mieee-fp -rdynamic -shared -fschedule-insns

and the same without -fschedule-insns.

The runtime for direct+inverse FFT with -fschedule-insns was 1.264 seconds,
and the runtime without -fschedule-insns was 1.444 seconds, which is a 14%
speedup for that one compiler option.

This is on a 2.33GHz Core 2 quad machine.

I'll attach the inner loops of direct.c with and without -fschedule-insns.

I haven't been able to compile the complete Gambit runtime with
-fschedule-insns on either x86-64 or ppc64; I've filed PR41164 and PR41176
for those two different failures.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33928