https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90329

--- Comment #17 from Steve Kargl <sgk at troutmask dot apl.washington.edu> ---
On Mon, May 06, 2019 at 04:40:08PM +0000, kargl at gcc dot gnu.org wrote:
> > Since we applied the fix for PR 87689 to gcc 7, gcc 8 and gcc 9,
> > I would suggest that we make -fno-optimize-sibling-calls
> > the default on these branches.  Maintaining binary compatibility
> > (even if it is bug compatibility) with existing packages is
> > something we should strive for, especially with such
> > important software packages as BLAS and LAPACK. These packages
> > are one important reason why people still use Fortran, and
> > I would hate to push them towards flang with this.
> > 
> > For current trunk, I would recommend keeping the current
> > behavior and contacting the LAPACK maintainers to a) give them
> > a heads-up for this problem, and b) give them a year to work
> > out the problem.
> > 
> > Would this be something that people could agree on?
> 
> Does -fno-optimize-sibling-calls affect performance?
> A 1% (or less) degradation may be considered negligible,
> and an acceptable compromise. 10% would be unacceptable.
> 
> Guess I'll need to dust off my old copy of the Polyhedron
> Benchmarks and run a few tests.
> 

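So, I dusted off my old Polyhedron Benchmark (PB) code and ran
some tests.  The system is x86_64-*-freebsd.  With and without
-fno-optimize-sibling-calls, the geometric mean execution time
is 7.94 seconds, so I saw nothing to suggest that this option
would have a negative impact on performance.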

================================================================================
Date & Time     :  6 May 2019 13:29:24
Test Name       : gfcx
Compile Command : gfcx -static -ffpe-summary=none -O3 -pipe -mtune=native
   -march=native -ffast-math -ftree-vectorize -funroll-loops
   --param max-unroll-times=4 %n.f90 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct
    linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :      200.0
Target Error %  :      0.100
Minimum Repeats :     5
Maximum Repeats :    10

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      1.60     5511576      8.00       6  0.0893
      aermod     49.38     6822120     19.18       5  0.0054
         air      7.49     5632568      4.57       5  0.0340
    capacita      4.97     5639072     40.91       5  0.0384
     channel      1.31     5520904      1.86      10  0.2693
       doduc      7.53     5655488     19.33       6  0.0978
     fatigue      3.42     5618928      4.41       5  0.0170
     gas_dyn      3.80     5604440      1.91       5  0.0214
      induct      7.59     5771912      6.18       5  0.0209
       linpk      1.23     5486240      7.82       5  0.0096
        mdbx      2.97     5553200      7.33       5  0.0388
          nf      2.73     5533744      8.57       5  0.0095
     protein      4.89     5762320     25.13       5  0.0494
      rnflow      7.87     5965568     34.26       5  0.0810
    test_fpu      5.90     5705216      6.03      10  0.1206
        tfft      1.64     5519096      1.81       5  0.0138

Geometric Mean Execution Time =       7.94 seconds

================================================================================

================================================================================
Date & Time     :  6 May 2019 16:11:36
Test Name       : gfcx
Compile Command : gfcx -fno-optimize-sibling-calls -static -ffpe-summary=none
    -O3 -pipe -mtune=native -march=native -ffast-math -ftree-vectorize
    -funroll-loops --param max-unroll-times=4 %n.f90 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct
    linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :      200.0
Target Error %  :      0.100
Minimum Repeats :     5
Maximum Repeats :    10

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      1.58     5511576      7.99       5  0.0139
      aermod     49.16     6822120     19.17       5  0.0216
         air      7.48     5632568      4.57       5  0.0274
    capacita      4.95     5639072     41.24       5  0.3593
     channel      1.31     5520904      1.86      10  0.2607
       doduc      8.00     5655488     19.29      10  0.0947
     fatigue      3.43     5618928      4.40       5  0.0545
     gas_dyn      3.81     5604440      1.97       5  0.0328
      induct      7.58     5771912      6.18       5  0.0121
       linpk      1.23     5486240      7.85       5  0.0699
        mdbx      2.97     5553200      7.28       5  0.0703
          nf      2.72     5533744      8.62       5  0.0998
     protein      4.87     5762320     25.30       8  0.1194
      rnflow      7.86     5965568     34.28       6  0.2875
    test_fpu      5.88     5705216      6.03       8  0.0980
        tfft      1.64     5519096      1.74       6  0.0841

Geometric Mean Execution Time =       7.94 seconds

================================================================================
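
For anyone who wants to check the two reports against each other,
here is a minimal Python sketch that recomputes the geometric mean
from the "Ave Run" columns and lists the per-benchmark change.  The
script and its names (BASELINE, NO_SIBCALL, geometric_mean) are
illustrative and not part of the Polyhedron harness; the times are
copied by hand from the two tables above.

```python
import math

# Average run times in seconds, copied from the "Ave Run" columns above.
# BASELINE is the run without the flag, NO_SIBCALL the run with
# -fno-optimize-sibling-calls. Both names are hypothetical.
BASELINE = {
    "ac": 8.00, "aermod": 19.18, "air": 4.57, "capacita": 40.91,
    "channel": 1.86, "doduc": 19.33, "fatigue": 4.41, "gas_dyn": 1.91,
    "induct": 6.18, "linpk": 7.82, "mdbx": 7.33, "nf": 8.57,
    "protein": 25.13, "rnflow": 34.26, "test_fpu": 6.03, "tfft": 1.81,
}
NO_SIBCALL = {
    "ac": 7.99, "aermod": 19.17, "air": 4.57, "capacita": 41.24,
    "channel": 1.86, "doduc": 19.29, "fatigue": 4.40, "gas_dyn": 1.97,
    "induct": 6.18, "linpk": 7.85, "mdbx": 7.28, "nf": 8.62,
    "protein": 25.30, "rnflow": 34.28, "test_fpu": 6.03, "tfft": 1.74,
}


def geometric_mean(values):
    """Geometric mean of positive numbers, computed as exp(mean(log(x)))."""
    values = list(values)
    return math.exp(math.fsum(math.log(v) for v in values) / len(values))


gm_base = geometric_mean(BASELINE.values())
gm_nosib = geometric_mean(NO_SIBCALL.values())

# Relative change per benchmark in percent; positive means the run with
# -fno-optimize-sibling-calls was slower.
change = {
    name: 100.0 * (NO_SIBCALL[name] / BASELINE[name] - 1.0)
    for name in BASELINE
}

print(f"geometric mean, default flags:                {gm_base:.2f} s")
print(f"geometric mean, -fno-optimize-sibling-calls:  {gm_nosib:.2f} s")
for name, pct in sorted(change.items(), key=lambda kv: kv[1]):
    print(f"{name:>10}  {pct:+6.2f} %")
```

Both means round to 7.94 seconds, matching the reports.  The
individual differences stay below 4% in either direction, with tfft
faster and gas_dyn slower when the option is given.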
