https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90329
--- Comment #17 from Steve Kargl <sgk at troutmask dot apl.washington.edu> ---
On Mon, May 06, 2019 at 04:40:08PM +0000, kargl at gcc dot gnu.org wrote:
> > Since we applied the fix for PR 87689 to gcc 7, gcc 8 and gcc 9,
> > I would suggest that we make -fno-optimize-sibling-calls
> > the default on these branches. Maintaining binary compatibility
> > (even if it is bug compatibility) with existing packages is
> > something we should strive for, especially with such
> > important software packages as BLAS and LAPACK. These packages
> > are one important reason why people still use Fortran, and
> > I would hate to push them towards flang with this.
> >
> > For current trunk, I would recommend keeping the current
> > behavior and contact the LAPACK maintainers to a) give them
> > a heads-up for this problem, and b) a year to work out
> > the problem.
> >
> > Would this be something that people could agree on?
>
> Does -fno-optimize-sibling-calls affect performance?
> A 1% (or less) degradation may be considered negligible,
> and an acceptable compromise. 10% would be unacceptable.
>
> Guess I'll need to dust off my old copy of the Polyhedron
> Benchmarks and run a few tests.
>
So, I dusted off my old PB code and ran some tests.
The system is x86_64-*-freebsd. I saw nothing to
suggest that this option would have a negative
impact on performance.
================================================================================
Date & Time : 6 May 2019 13:29:24
Test Name : gfcx
Compile Command : gfcx -static -ffpe-summary=none -O3 -pipe -mtune=native
-march=native -ffast-math -ftree-vectorize -funroll-loops
--param max-unroll-times=4 %n.f90 -o %n
Benchmarks : ac aermod air capacita channel doduc fatigue gas_dyn induct
linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times : 200.0
Target Error % : 0.100
Minimum Repeats : 5
Maximum Repeats : 10
Benchmark  Compile  Executable  Ave Run   Number   Estim
Name        (secs)     (bytes)   (secs)  Repeats   Err %
---------  -------  ----------  -------  -------  ------
ac            1.60     5511576     8.00        6  0.0893
aermod       49.38     6822120    19.18        5  0.0054
air           7.49     5632568     4.57        5  0.0340
capacita      4.97     5639072    40.91        5  0.0384
channel       1.31     5520904     1.86       10  0.2693
doduc         7.53     5655488    19.33        6  0.0978
fatigue       3.42     5618928     4.41        5  0.0170
gas_dyn       3.80     5604440     1.91        5  0.0214
induct        7.59     5771912     6.18        5  0.0209
linpk         1.23     5486240     7.82        5  0.0096
mdbx          2.97     5553200     7.33        5  0.0388
nf            2.73     5533744     8.57        5  0.0095
protein       4.89     5762320    25.13        5  0.0494
rnflow        7.87     5965568    34.26        5  0.0810
test_fpu      5.90     5705216     6.03       10  0.1206
tfft          1.64     5519096     1.81        5  0.0138
Geometric Mean Execution Time = 7.94 seconds
================================================================================
================================================================================
Date & Time : 6 May 2019 16:11:36
Test Name : gfcx
Compile Command : gfcx -fno-optimize-sibling-calls -static -ffpe-summary=none
-O3 -pipe -mtune=native -march=native -ffast-math -ftree-vectorize
-funroll-loops --param max-unroll-times=4 %n.f90 -o %n
Benchmarks : ac aermod air capacita channel doduc fatigue gas_dyn induct
linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times : 200.0
Target Error % : 0.100
Minimum Repeats : 5
Maximum Repeats : 10
Benchmark  Compile  Executable  Ave Run   Number   Estim
Name        (secs)     (bytes)   (secs)  Repeats   Err %
---------  -------  ----------  -------  -------  ------
ac            1.58     5511576     7.99        5  0.0139
aermod       49.16     6822120    19.17        5  0.0216
air           7.48     5632568     4.57        5  0.0274
capacita      4.95     5639072    41.24        5  0.3593
channel       1.31     5520904     1.86       10  0.2607
doduc         8.00     5655488    19.29       10  0.0947
fatigue       3.43     5618928     4.40        5  0.0545
gas_dyn       3.81     5604440     1.97        5  0.0328
induct        7.58     5771912     6.18        5  0.0121
linpk         1.23     5486240     7.85        5  0.0699
mdbx          2.97     5553200     7.28        5  0.0703
nf            2.72     5533744     8.62        5  0.0998
protein       4.87     5762320    25.30        8  0.1194
rnflow        7.86     5965568    34.28        6  0.2875
test_fpu      5.88     5705216     6.03        8  0.0980
tfft          1.64     5519096     1.74        6  0.0841
Geometric Mean Execution Time = 7.94 seconds
================================================================================
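
For anyone who wants to look at the per-benchmark differences rather than
only the geometric mean, the following is a minimal Python sketch. It was
not part of the original runs; it only re-uses the "Ave Run (secs)" values
from the two reports above. The names RUN_TIMES, geometric_mean and
slowdown_ratios are illustrative, not anything produced by the Polyhedron
harness.

```python
import math

# "Ave Run (secs)" copied from the two reports above.
# benchmark -> (default flags, with -fno-optimize-sibling-calls)
RUN_TIMES = {
    "ac":       (8.00, 7.99),
    "aermod":   (19.18, 19.17),
    "air":      (4.57, 4.57),
    "capacita": (40.91, 41.24),
    "channel":  (1.86, 1.86),
    "doduc":    (19.33, 19.29),
    "fatigue":  (4.41, 4.40),
    "gas_dyn":  (1.91, 1.97),
    "induct":   (6.18, 6.18),
    "linpk":    (7.82, 7.85),
    "mdbx":     (7.33, 7.28),
    "nf":       (8.57, 8.62),
    "protein":  (25.13, 25.30),
    "rnflow":   (34.26, 34.28),
    "test_fpu": (6.03, 6.03),
    "tfft":     (1.81, 1.74),
}


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def slowdown_ratios(times):
    """Map each benchmark to (time with flag) / (time without flag)."""
    return {name: with_flag / base for name, (base, with_flag) in times.items()}


ratios = slowdown_ratios(RUN_TIMES)
overall = geometric_mean(ratios.values())  # > 1.0 means slower with the flag
worst = max(ratios, key=ratios.get)        # largest slowdown with the flag
best = min(ratios, key=ratios.get)         # largest speedup with the flag

if __name__ == "__main__":
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{name:9s} {100.0 * (ratio - 1.0):+6.2f}%")
    print(f"geometric mean of ratios: {overall:.4f}")
    print(f"largest slowdown: {worst}, largest speedup: {best}")
```

A ratio above 1.0 means the run with -fno-optimize-sibling-calls was
slower. Several rows in both reports have an Estim Err % above the 0.100
target, so small per-benchmark differences may simply be measurement noise.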