https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116285

Andi Kleen <andi-gcc at firstfloor dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andi-gcc at firstfloor dot org

--- Comment #2 from Andi Kleen <andi-gcc at firstfloor dot org> ---
push_to_top_level is about 5% of the cycles and seems to do a lot of list
walking over different scopes. Maybe a better data structure for the scopes,
like a vector, would help.
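
To illustrate the idea only, here is a minimal sketch of the two layouts. The
Scope type and both functions are hypothetical stand-ins, not GCC's actual
binding-level structures or the code in push_to_top_level. The list form
chases one pointer per enclosing scope, while the vector form keeps the scope
stack contiguous, which tends to be friendlier to the cache.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a binding scope; NOT GCC's real data structure.
struct Scope {
    int bindings = 0;        // number of names bound in this scope
    Scope* outer = nullptr;  // enclosing scope (used by the list form only)
};

// Linked-list form: every step follows a pointer to a separately
// allocated node, from the innermost scope out to the global one.
inline int count_bindings_list(const Scope* innermost) {
    int total = 0;
    for (const Scope* s = innermost; s != nullptr; s = s->outer)
        total += s->bindings;
    return total;
}

// Vector form: the scope stack is stored contiguously, innermost scope last,
// so the walk is a linear scan over adjacent memory.
inline int count_bindings_vector(const std::vector<Scope>& stack) {
    int total = 0;
    for (auto it = stack.rbegin(); it != stack.rend(); ++it)
        total += it->bindings;
    return total;
}

// Small self-check: both layouts describe the same three nested scopes.
inline void scope_walk_demo() {
    Scope global; global.bindings = 3;
    Scope ns;     ns.bindings = 2; ns.outer = &global;
    Scope fn;     fn.bindings = 1; fn.outer = &ns;
    std::vector<Scope> stack = {global, ns, fn};
    assert(count_bindings_list(&fn) == count_bindings_vector(stack));
}
```

Whether this would actually help in cc1plus depends on how the real scopes are
created and destroyed, so it would need to be measured.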

On my Skylake it appears to be primarily Frontend Bound due to the large code
size, so you might get a slight improvement from a host compiler built with
profile feedback, which does hot/cold code splitting.

3+% is GC, so you could get some boost by raising the GC limits so that it
collects less often. Try playing with --param ggc-min-expand and
--param ggc-min-heapsize.
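
As a rough model of why raising these two parameters reduces the number of
collections, here is a simplified sketch based on how the GCC manual describes
them: ggc-min-heapsize is a heap size (in kB) below which no collection
happens, and ggc-min-expand is the percentage the heap must grow beyond its
size after the previous collection. The GcModel type, count_collections, the
fixed survival fraction, and all numeric values are assumptions made up for
the illustration, not GCC code or recommended settings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Simplified, hypothetical model of the collection trigger.
struct GcModel {
    double min_expand_percent;          // models --param ggc-min-expand
    std::size_t min_heap_kb;            // models --param ggc-min-heapsize (kB)
    std::size_t allocated = 0;          // bytes currently allocated
    std::size_t allocated_last_gc = 0;  // bytes that survived the last GC
    int collections = 0;                // number of collections so far

    // Collect only once the heap has grown min_expand_percent beyond
    // max(size after last GC, minimum heap size).
    bool should_collect() const {
        double base = static_cast<double>(
            std::max(allocated_last_gc, min_heap_kb * 1024));
        return static_cast<double>(allocated) >=
               base * (1.0 + min_expand_percent / 100.0);
    }
};

// Allocate total_bytes in chunk-sized steps and count the collections,
// assuming (arbitrarily) that live_fraction of the heap survives each one.
inline int count_collections(double min_expand_percent, std::size_t min_heap_kb,
                             std::size_t total_bytes, std::size_t chunk,
                             double live_fraction) {
    assert(chunk > 0 && live_fraction >= 0.0 && live_fraction <= 1.0);
    GcModel gc{min_expand_percent, min_heap_kb};
    for (std::size_t done = 0; done < total_bytes; done += chunk) {
        gc.allocated += chunk;
        if (gc.should_collect()) {
            ++gc.collections;
            gc.allocated = static_cast<std::size_t>(
                static_cast<double>(gc.allocated) * live_fraction);
            gc.allocated_last_gc = gc.allocated;
        }
    }
    return gc.collections;
}
```

In this model, a compile that allocates 64 MB collects repeatedly with a small
minimum heap and expansion percentage, and never collects once the minimum heap
is at least as large as everything it allocates. The trade-off is higher peak
memory use.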

0.94% of the cycles are iterative_hash, so you might get another slight
improvement from https://github.com/andikleen/gcc/commits/rapidhash-1
which switches the hash function to something more modern
(I'm still looking for supporting data that it actually helps).

But none of this will drastically cut the time; the profile is fairly flat.

# Overhead  Command  Source Shared Object  Source Symbol
# ........  .......  ....................  .............
#
     5.11%  cc1plus  cc1plus               [.] push_to_top_level()
     2.71%  cc1plus  cc1plus               [.] gt_ggc_mx_lang_tree_node(void*)
     1.00%  cc1plus  cc1plus               [.] ggc_set_mark(void const*)
     0.94%  cc1plus  cc1plus               [.] iterative_hash
     0.73%  cc1plus  cc1plus               [.] fields_linear_search(tree_node*, tree_node*, bool) [clone .isra.0]
     0.72%  cc1plus  cc1plus               [.] iterative_hash_template_arg(tree_node*, unsigned int)
     0.67%  cc1plus  cc1plus               [.] ggc_internal_alloc(unsigned long, void (*)(void*), unsigned long, unsigned long)
     0.64%  cc1plus  cc1plus               [.] gt_ggc_mx_lang_tree_node(void*)
     0.54%  cc1plus  cc1plus               [.] ggc_set_mark(void const*)
     0.54%  cc1plus  cc1plus               [.] fields_linear_search(tree_node*, tree_node*, bool) [clone .isra.0]
     0.51%  cc1plus  cc1plus               [.] fields_linear_search(tree_node*, tree_node*, bool) [clone .isra.0]
