https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90329
--- Comment #17 from Steve Kargl <sgk at troutmask dot apl.washington.edu> ---
On Mon, May 06, 2019 at 04:40:08PM +0000, kargl at gcc dot gnu.org wrote:
> > Since we applied the fix for PR 87689 to gcc 7, gcc 8 and gcc 9,
> > I would suggest that we make -fno-optimize-sibling-calls
> > the default on these branches.  Maintaining binary compatibility
> > (even if it is bug compatibility) with existing packages is
> > something we should strive for, especially with such
> > important software packages as BLAS and LAPACK.  These packages
> > are one important reason why people still use Fortran, and
> > I would hate to push them towards flang with this.
> >
> > For current trunk, I would recommend keeping the current
> > behavior and contacting the LAPACK maintainers to a) give them
> > a heads-up for this problem, and b) a year to work out
> > the problem.
> >
> > Would this be something that people could agree on?
>
> Does -fno-optimize-sibling-calls affect performance?
> A 1% (or less) degradation may be considered negligible,
> and an acceptable compromise.  10% would be unacceptable.
>
> Guess I'll need to dust off my old copy of the Polyhedron
> Benchmarks and run a few tests.
>

So, I dusted off my old PB code and ran some tests.  The system
is x86_64-*-freebsd.  I saw nothing to suggest that this option
would have a negative impact on performance.
================================================================================
Date & Time     : 6 May 2019 13:29:24
Test Name       : gfcx
Compile Command : gfcx -static -ffpe-summary=none -O3 -pipe -mtune=native -march=native -ffast-math -ftree-vectorize -funroll-loops --param max-unroll-times=4 %n.f90 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   : 200.0
Target Error %  : 0.100
Minimum Repeats : 5
Maximum Repeats : 10

Benchmark   Compile  Executable   Ave Run   Number    Estim
     Name    (secs)     (bytes)    (secs)  Repeats    Err %
---------   -------  ----------   -------  -------   ------
       ac      1.60     5511576      8.00        6   0.0893
   aermod     49.38     6822120     19.18        5   0.0054
      air      7.49     5632568      4.57        5   0.0340
 capacita      4.97     5639072     40.91        5   0.0384
  channel      1.31     5520904      1.86       10   0.2693
    doduc      7.53     5655488     19.33        6   0.0978
  fatigue      3.42     5618928      4.41        5   0.0170
  gas_dyn      3.80     5604440      1.91        5   0.0214
   induct      7.59     5771912      6.18        5   0.0209
    linpk      1.23     5486240      7.82        5   0.0096
     mdbx      2.97     5553200      7.33        5   0.0388
       nf      2.73     5533744      8.57        5   0.0095
  protein      4.89     5762320     25.13        5   0.0494
   rnflow      7.87     5965568     34.26        5   0.0810
 test_fpu      5.90     5705216      6.03       10   0.1206
     tfft      1.64     5519096      1.81        5   0.0138

Geometric Mean Execution Time = 7.94 seconds
================================================================================

================================================================================
Date & Time     : 6 May 2019 16:11:36
Test Name       : gfcx
Compile Command : gfcx -fno-optimize-sibling-calls -static -ffpe-summary=none -O3 -pipe -mtune=native -march=native -ffast-math -ftree-vectorize -funroll-loops --param max-unroll-times=4 %n.f90 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   : 200.0
Target Error %  : 0.100
Minimum Repeats : 5
Maximum Repeats : 10

Benchmark   Compile  Executable   Ave Run   Number    Estim
     Name    (secs)     (bytes)    (secs)  Repeats    Err %
---------   -------  ----------   -------  -------   ------
       ac      1.58     5511576      7.99        5   0.0139
   aermod     49.16     6822120     19.17        5   0.0216
      air      7.48     5632568      4.57        5   0.0274
 capacita      4.95     5639072     41.24        5   0.3593
  channel      1.31     5520904      1.86       10   0.2607
    doduc      8.00     5655488     19.29       10   0.0947
  fatigue      3.43     5618928      4.40        5   0.0545
  gas_dyn      3.81     5604440      1.97        5   0.0328
   induct      7.58     5771912      6.18        5   0.0121
    linpk      1.23     5486240      7.85        5   0.0699
     mdbx      2.97     5553200      7.28        5   0.0703
       nf      2.72     5533744      8.62        5   0.0998
  protein      4.87     5762320     25.30        8   0.1194
   rnflow      7.86     5965568     34.28        6   0.2875
 test_fpu      5.88     5705216      6.03        8   0.0980
     tfft      1.64     5519096      1.74        6   0.0841

Geometric Mean Execution Time = 7.94 seconds
================================================================================
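
For anyone who wants to compare the two runs side by side, here is a minimal Python sketch. It is not part of the Polyhedron harness; the names BASELINE, NO_SIBCALL, geometric_mean and percent_change are made up for illustration, and it assumes the reported "Geometric Mean Execution Time" is the plain geometric mean of the "Ave Run (secs)" column.

```python
import math

# "Ave Run (secs)" values copied from the two reports above.
# BASELINE is the run without the flag; NO_SIBCALL is the run with
# -fno-optimize-sibling-calls.  Both names are illustrative only.
BASELINE = {
    "ac": 8.00, "aermod": 19.18, "air": 4.57, "capacita": 40.91,
    "channel": 1.86, "doduc": 19.33, "fatigue": 4.41, "gas_dyn": 1.91,
    "induct": 6.18, "linpk": 7.82, "mdbx": 7.33, "nf": 8.57,
    "protein": 25.13, "rnflow": 34.26, "test_fpu": 6.03, "tfft": 1.81,
}
NO_SIBCALL = {
    "ac": 7.99, "aermod": 19.17, "air": 4.57, "capacita": 41.24,
    "channel": 1.86, "doduc": 19.29, "fatigue": 4.40, "gas_dyn": 1.97,
    "induct": 6.18, "linpk": 7.85, "mdbx": 7.28, "nf": 8.62,
    "protein": 25.30, "rnflow": 34.28, "test_fpu": 6.03, "tfft": 1.74,
}


def geometric_mean(values):
    """Geometric mean, computed as exp of the mean of the logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def percent_change(before, after):
    """Relative change from before to after, in percent (positive = slower)."""
    return 100.0 * (after - before) / before


gm_base = geometric_mean(BASELINE.values())
gm_flag = geometric_mean(NO_SIBCALL.values())

print(f"geometric mean, default flags        : {gm_base:.2f} s")
print(f"geometric mean, -fno-optimize-sib... : {gm_flag:.2f} s")
print(f"change in geometric mean             : {percent_change(gm_base, gm_flag):+.2f} %")
print()

# Per-benchmark run times and relative change.
for name in BASELINE:
    delta = percent_change(BASELINE[name], NO_SIBCALL[name])
    print(f"{name:>9} {BASELINE[name]:8.2f} {NO_SIBCALL[name]:8.2f} {delta:+7.2f} %")
```

With the numbers above, both geometric means round to the 7.94 seconds that the harness reports. The largest per-benchmark differences are a few percent and go in both directions (gas_dyn slower, tfft faster), both on benchmarks that run for only about two seconds.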