http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #20 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-23 23:20:34 UTC ---
> This makes hookes_law estimate to be 91 instructions, so -finline-limit=183
> should be enough.

With the patch in comment #19, I rather find a threshold of
-finline-limit=256. On top of that, as shown by the timings below, the patch
increases the threshold for ac.f90 and breaks the vectorization for
induct.f90.

Would the patch in comment #15 and an increase of the default value for
-finline-limit to 300 be acceptable at this stage (with the usual bells and
whistles: SPEC, ...)?

================================================================================
Date & Time     : 23 Jan 2011 23:18:23
Test Name       : pbharness
Compile Command : gfcp %n.f90 -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer -finline-limit=300 -fwhole-program -flto -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   : 300.0
Target Error %  : 0.200
Minimum Repeats : 2
Maximum Repeats : 5

Benchmark   Compile  Executable   Ave Run   Number    Estim
     Name    (secs)     (bytes)    (secs)  Repeats    Err %
---------   -------  ----------   -------  -------   ------
       ac      3.15       50536      9.58        2   0.0156
   aermod    104.98     1652280     18.79        2   0.1011
      air      8.83       90048      6.99        5   0.7334
 capacita      5.95       89056     40.21        2   0.0174
  channel      1.65       34448      2.99        2   0.0502
    doduc     14.59      208056     27.91        2   0.0036
  fatigue      4.80       89264      4.72        2   0.0212
  gas_dyn     11.65      148176      4.66        5   0.4391
   induct     11.20      205976     22.34        2   0.0672
    linpk      1.59       21536     21.70        2   0.0299
     mdbx      5.78       84760     12.58        2   0.0119
       nf      7.60       83712     29.53        5   0.3854
  protein     11.69      163760     35.18        2   0.1109
   rnflow     15.23      167296     26.97        2   0.0890
 test_fpu     11.33      145848     11.06        5   0.3715
     tfft      1.13       22072      3.30        2   0.0607

Geometric Mean Execution Time = 12.89 seconds

================================================================================
Date & Time     : 23 Jan 2011 23:54:28
Test Name       : pbharness
Compile Command : gfcp %n.f90 -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer -finline-limit=600 -fwhole-program -flto -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   : 300.0
Target Error %  : 0.200
Minimum Repeats : 2
Maximum Repeats : 5

Benchmark   Compile  Executable   Ave Run   Number    Estim
     Name    (secs)     (bytes)    (secs)  Repeats    Err %
---------   -------  ----------   -------  -------   ------
       ac      3.59       54576      8.10        2   0.0062
   aermod    103.73     1558344     18.91        2   0.0238
      air     10.47       89992      6.77        5   0.1563
 capacita      7.47      101344     40.08        2   0.0137
  channel      1.65       34448      2.97        5   0.5872
    doduc     15.82      216376     27.61        2   0.0000
  fatigue      5.10       89264      4.73        2   0.0000
  gas_dyn     12.09      152264      4.69        5   0.6428
   induct     11.10      205976     22.33        2   0.0403
    linpk      1.59       21536     21.72        2   0.0368
     mdbx      5.85       84760     12.58        2   0.0517
       nf     11.34      108280     28.98        2   0.1087
  protein     11.65      163760     35.18        3   0.1422
   rnflow     17.39      183696     26.71        2   0.0243
 test_fpu     11.49      145816     11.02        2   0.1226
     tfft      1.43       22072      3.29        2   0.0911

Geometric Mean Execution Time = 12.70 seconds
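
For anyone who wants to cross-check the two "Geometric Mean Execution Time" figures, here is a minimal Python sketch that recomputes them from the "Ave Run (secs)" column of each table. It is not part of pbharness; the names `RUN_TIMES_LIMIT_300`, `RUN_TIMES_LIMIT_600` and `geometric_mean` are made up for this illustration, and the run times are copied by hand from the tables above.

```python
import math

# "Ave Run (secs)" values copied from the two pbharness tables above, in
# benchmark order: ac aermod air capacita channel doduc fatigue gas_dyn
# induct linpk mdbx nf protein rnflow test_fpu tfft.
# The variable names are hypothetical, chosen only for this sketch.
RUN_TIMES_LIMIT_300 = [
    9.58, 18.79, 6.99, 40.21, 2.99, 27.91, 4.72, 4.66,
    22.34, 21.70, 12.58, 29.53, 35.18, 26.97, 11.06, 3.30,
]
RUN_TIMES_LIMIT_600 = [
    8.10, 18.91, 6.77, 40.08, 2.97, 27.61, 4.73, 4.69,
    22.33, 21.72, 12.58, 28.98, 35.18, 26.71, 11.02, 3.29,
]


def geometric_mean(values):
    """Return the geometric mean, computed as exp(mean(log(x)))."""
    if not values:
        raise ValueError("need at least one value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    for label, times in (
        ("-finline-limit=300", RUN_TIMES_LIMIT_300),
        ("-finline-limit=600", RUN_TIMES_LIMIT_600),
    ):
        print(f"{label}: geometric mean = {geometric_mean(times):.2f} s")
```

Working in log space avoids multiplying sixteen run times together directly. Because the geometric mean weights relative changes equally across benchmarks, most of the difference between the two runs should come from ac (9.58 s versus 8.10 s), with the other tests nearly unchanged.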