On 2017.11.07 at 00:12 +0100, Jan Hubicka wrote:
> > On 2017.11.05 at 11:55 +0100, Jan Hubicka wrote:
> > > > On 2017.11.03 at 16:48 +0100, Jan Hubicka wrote:
> > > > > This is the updated patch, which I have committed after a
> > > > > profiled bootstrap on x86-64.
> > > >
> > > > Unfortunately, compiling tramp3d-v4.cpp is 6-7% slower after this patch.
> > > > This happens with an LTO/PGO bootstrapped gcc using 
> > > > --enable-checking=release.
> > >
> > > Our periodic testers have also picked up the change, and no compile-time
> > > regression is reported for tramp3d:
> > > https://gcc.opensuse.org/gcc-old/c++bench-czerny/tramp3d/
> > > So I would conclude that it is a regression in the LTO+PGO bootstrap. I am
> > > fixing one checking bug that may cause it (where we mix local and global
> > > profiles), so perhaps it will go away afterwards.
> >
> > Just to confirm: pure PGO bootstrap is fine, e.g. on Ryzen:
> > (LTO/PGO) 17.65 sec ( +- 0.68% )
> > (PGO)     15.74 sec ( +- 0.27% )
>
> Thanks. I have committed the patch for the inlining profile update bug, so
> with some luck LTO/PGO may be fine again.

It got worse, unfortunately:

Pure PGO:
 Performance counter stats for '/home/trippels/gcc_8/usr/local/bin/g++ -w -Ofast tramp3d-v4.cpp' (4 runs):

      16213.529306      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.25% )
             1,387      context-switches          #    0.086 K/sec                    ( +-  0.17% )
                 4      cpu-migrations            #    0.000 K/sec                    ( +- 14.80% )
           261,764      page-faults               #    0.016 M/sec                    ( +-  0.03% )
    62,633,457,222      cycles                    #    3.863 GHz                      ( +-  0.20% )  (83.32%)
    13,990,050,204      stalled-cycles-frontend   #   22.34% frontend cycles idle     ( +-  0.51% )  (83.33%)
    13,189,755,888      stalled-cycles-backend    #   21.06% backend cycles idle      ( +-  0.04% )  (83.31%)
    75,194,592,630      instructions              #    1.20  insn per cycle
                                                  #    0.19  stalled cycles per insn  ( +-  0.03% )  (83.35%)
    17,113,639,942      branches                  # 1055.516 M/sec                    ( +-  0.02% )  (83.38%)
       634,471,544      branch-misses             #    3.71% of all branches          ( +-  0.07% )  (83.34%)

      16.226375499 seconds time elapsed                                          ( +-  0.24% )

LTO/PGO:
 Performance counter stats for '/home/trippels/gcc_8/usr/local/bin/g++ -w -Ofast tramp3d-v4.cpp' (4 runs):

      18622.496264      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.35% )
             1,592      context-switches          #    0.086 K/sec                    ( +-  0.32% )
                 4      cpu-migrations            #    0.000 K/sec                    ( +- 14.43% )
           261,370      page-faults               #    0.014 M/sec                    ( +-  0.12% )
    71,849,030,564      cycles                    #    3.858 GHz                      ( +-  0.08% )  (83.34%)
    15,987,209,604      stalled-cycles-frontend   #   22.25% frontend cycles idle     ( +-  0.47% )  (83.32%)
    14,336,345,458      stalled-cycles-backend    #   19.95% backend cycles idle      ( +-  0.05% )  (83.33%)
    87,674,608,740      instructions              #    1.22  insn per cycle
                                                  #    0.18  stalled cycles per insn  ( +-  0.01% )  (83.36%)
    20,610,950,144      branches                  # 1106.777 M/sec                    ( +-  0.01% )  (83.35%)
       638,454,497      branch-misses             #    3.10% of all branches          ( +-  0.08% )  (83.35%)

      18.644370559 seconds time elapsed                                          ( +-  0.38% )
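
To put a number on "worse", the relative difference between the two builds can
be worked out from the counters above. The following is only a rough sketch in
Python; the helper name slowdown_percent and the two tables are illustrative
and not part of any GCC or perf tooling. The values are copied from the perf
output and from the earlier Ryzen timings quoted above.

```python
def slowdown_percent(baseline: float, candidate: float) -> float:
    """Return how much larger candidate is than baseline, in percent."""
    return (candidate / baseline - 1.0) * 100.0


# (pure PGO, LTO/PGO) pairs copied from the two perf stat runs above.
PERF_COUNTERS = {
    "seconds elapsed": (16.226375499, 18.644370559),
    "cycles": (62_633_457_222, 71_849_030_564),
    "instructions": (75_194_592_630, 87_674_608_740),
    "branches": (17_113_639_942, 20_610_950_144),
    "branch-misses": (634_471_544, 638_454_497),
}

# Earlier Ryzen timings quoted in this thread (pure PGO, LTO/PGO), in seconds.
EARLIER_SECONDS = (15.74, 17.65)

if __name__ == "__main__":
    print(f"earlier elapsed: {slowdown_percent(*EARLIER_SECONDS):+.1f}%")
    for name, (pgo, lto_pgo) in PERF_COUNTERS.items():
        print(f"{name:>16}: {slowdown_percent(pgo, lto_pgo):+.1f}%")
```

By this calculation the LTO/PGO compiler needs roughly 15% more wall time than
the pure PGO one, up from about 12% in the earlier measurement, and it executes
roughly 17% more instructions at a nearly unchanged IPC. That suggests the
extra time comes mostly from additional instructions executed rather than from
stalls.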

--
Markus
