On 2017.05.25 at 11:55 +0200, Martin Liška wrote:
> Hi.
>
> As I discussed the PGO with Honza and Richi, the current 3-stage build is
> not ideal for the following 2 reasons:
>
> 1) the stageprofile compiler is trained just on the libraries that are built
> during stage2
> 2) apart from that, as the compiler is also used to build the final compiler,
> the profile is being updated during the build. So the stage2 compiler is
> making different decisions.
>
> Both problems can be resolved by adding another step in between the current
> stage2 and stage3, where we train the stage2 compiler by building the
> compiler with default options.
>
> I'm going to do some measurements.

I did some measurements on gcc67 (trunk with --enable-checking=release).
The apparent speedup is in the noise.

Without your patch:

 Performance counter stats for 'g++ -w -Ofast tramp3d-v4.cpp' (10 runs):

      15749.058451      task-clock (msec)         #    0.997 CPUs utilized            ( +-  0.13% )
             1,352      context-switches          #    0.086 K/sec                    ( +-  0.16% )
                 7      cpu-migrations            #    0.000 K/sec                    ( +-  5.73% )
           269,142      page-faults               #    0.017 M/sec                    ( +-  0.01% )
    60,676,581,181      cycles                    #    3.853 GHz                      ( +-  0.09% )  (83.35%)
    13,401,784,189      stalled-cycles-frontend   #   22.09% frontend cycles idle     ( +-  0.20% )  (83.33%)
    12,926,843,370      stalled-cycles-backend    #   21.30% backend cycles idle      ( +-  0.04% )  (83.31%)
    73,074,099,356      instructions              #    1.20  insn per cycle
                                                  #    0.18  stalled cycles per insn  ( +-  0.02% )  (83.34%)
    16,607,220,814      branches                  # 1054.490 M/sec                    ( +-  0.03% )  (83.36%)
       616,673,310      branch-misses             #    3.71% of all branches          ( +-  0.08% )  (83.36%)

      15.803602619 seconds time elapsed                                          ( +-  0.14% )

With your patch:

 Performance counter stats for 'g++ -w -Ofast tramp3d-v4.cpp' (10 runs):

      15735.220610      task-clock (msec)         #    0.997 CPUs utilized            ( +-  0.11% )
             1,354      context-switches          #    0.086 K/sec                    ( +-  0.22% )
                 6      cpu-migrations            #    0.000 K/sec                    ( +-  6.67% )
           269,164      page-faults               #    0.017 M/sec                    ( +-  0.01% )
    60,723,862,242      cycles                    #    3.859 GHz                      ( +-  0.08% )  (83.35%)
    13,382,554,421      stalled-cycles-frontend   #   22.04% frontend cycles idle     ( +-  0.14% )  (83.31%)
    12,912,171,664      stalled-cycles-backend    #   21.26% backend cycles idle      ( +-  0.03% )  (83.34%)
    73,109,081,227      instructions              #    1.20  insn per cycle
                                                  #    0.18  stalled cycles per insn  ( +-  0.03% )  (83.34%)
    16,590,421,798      branches                  # 1054.349 M/sec                    ( +-  0.02% )  (83.35%)
       616,669,135      branch-misses             #    3.72% of all branches          ( +-  0.08% )  (83.36%)

      15.788772466 seconds time elapsed                                          ( +-  0.12% )
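
For what it's worth, here is a rough sketch of the arithmetic behind "in the
noise", using only the two "seconds time elapsed" lines above. It treats the
"+-" percentages printed by perf stat as relative uncertainties and combines
them in quadrature; that combination rule and the helper names are my own
assumptions, not something perf does for you.

```python
import math

def relative_delta_percent(baseline, patched):
    """Speedup of `patched` over `baseline`, as a percentage of baseline."""
    return (baseline - patched) / baseline * 100.0

def combined_noise_percent(noise_a, noise_b):
    """Combine two relative uncertainties (in percent) in quadrature.

    Assumption: the two measurements are independent.
    """
    return math.hypot(noise_a, noise_b)

# "seconds time elapsed" and its "+-" value, copied from the two runs above.
baseline_s, baseline_noise = 15.803602619, 0.14   # without the patch
patched_s, patched_noise = 15.788772466, 0.12     # with the patch

delta = relative_delta_percent(baseline_s, patched_s)
noise = combined_noise_percent(baseline_noise, patched_noise)

print(f"speedup: {delta:.3f}%  combined noise: {noise:.3f}%")
print("in the noise" if abs(delta) < noise else "outside the noise")
```

The elapsed-time difference comes out at roughly 0.09%, which is smaller than
either run's own "+-" value, so the two builds cannot be told apart from this
data.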



--
Markus
