http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48636
--- Comment #21 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2012-10-16 17:57:52 UTC ---

Before the patch in comment #20, I get

-rwxr-xr-x 1 dominiq staff 73336 Oct 16 19:19 a.out*
[macbook] lin/test% time gfc -fprotect-parens -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer --param max-inline-insns-auto=150 -fwhole-program -flto -fno-tree-loop-if-convert fatigue.f90
8.485u 0.205s 0:08.73 99.4% 0+0k 0+29io 0pf+0w
[macbook] lin/test% ll a.out
-rwxr-xr-x 1 dominiq staff 73336 Oct 16 19:19 a.out*
[macbook] lin/test% time a.out > /dev/null
2.916u 0.003s 0:02.92 99.6% 0+0k 0+1io 0pf+0w
[macbook] lin/test% time gfc -fprotect-parens -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer -fwhole-program -flto -fno-tree-loop-if-convert fatigue.f90
6.822u 0.193s 0:07.06 99.2% 0+0k 0+30io 0pf+0w
[macbook] lin/test% ll a.out
-rwxr-xr-x 1 dominiq staff 69312 Oct 16 19:21 a.out*
[macbook] lin/test% time a.out > /dev/null
4.851u 0.004s 0:04.86 99.7% 0+0k 0+1io 0pf+0w

After the patch I get

[macbook] lin/test% time gfc -fprotect-parens -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer -fwhole-program -flto -fno-tree-loop-if-convert fatigue.f90
7.277u 0.217s 0:07.52 99.4% 0+0k 0+28io 0pf+0w
[macbook] lin/test% ll a.out
-rwxr-xr-x 1 dominiq staff 69248 Oct 16 19:46 a.out*
[macbook] lin/test% time a.out > /dev/null
2.912u 0.003s 0:02.91 100.0% 0+0k 0+2io 0pf+0w

So for this particular test with the same options, after the patch the compilation is ~6% slower, the executable size is about the same (actually slightly smaller ;-) and the run time is ~40% shorter.

Comparing the unpatched compiler with --param max-inline-insns-auto=150 against the patched compiler without this option, the compilation is ~17% slower, the executable is ~6% larger, and the run time is the same.

Further testing coming, thanks for the patch.
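The relative figures quoted in this comment can be rechecked from the raw numbers with a short script. This is only a sketch: the run labels, the dictionary keys, and the helpers `pct_change` and `compare` are names made up here for illustration, and it assumes the user CPU time (the first field of the `time` output) is the quantity being compared, which the comment does not state explicitly.

```python
# Figures copied from the transcript in this comment: user CPU seconds for
# the compile and for the run, and the size of a.out in bytes.
# The run labels and key names are illustrative, not from the bug report.
runs = {
    "before_param150": {"compile_s": 8.485, "size_b": 73336, "run_s": 2.916},
    "before":          {"compile_s": 6.822, "size_b": 69312, "run_s": 4.851},
    "after":           {"compile_s": 7.277, "size_b": 69248, "run_s": 2.912},
}


def pct_change(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / old


def compare(a, b):
    """Percent change of every metric of run `a` relative to run `b`."""
    return {key: pct_change(runs[a][key], runs[b][key]) for key in runs[a]}


if __name__ == "__main__":
    # First pair: same options, patched vs unpatched.
    # Second pair: unpatched with the --param vs patched without it.
    for a, b in [("after", "before"), ("before_param150", "after")]:
        print(f"{a} relative to {b}:")
        for metric, pct in compare(a, b).items():
            print(f"  {metric}: {pct:+.1f}%")
```

Using the wall-clock column instead of the user time would only require swapping the `compile_s` and `run_s` values; single timings like these are noisy, so small differences should not be over-interpreted.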