https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116285

Andi Kleen <andi-gcc at firstfloor dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andi-gcc at firstfloor dot org

--- Comment #2 from Andi Kleen <andi-gcc at firstfloor dot org> ---
push_to_top_level is about 5% and seems to do a lot of list walking of
different scopes. Maybe a better data structure like a vector for the scopes
would help.

On my skylake it appears to be primarily Frontend Bound due to large code, so
you might get a slight improvement by using a profile feedback built host
compiler that does hot cold code splitting.

3+% is GC so you could get some boost by increasing the GC limits to GC less
often. Try playing with --param ggc-min-expand and --param ggc-min-heapsize.

0.94% of the cycles are iterative_hash, so you might get another slight
improvement from
https://github.com/andikleen/gcc/commits/rapidhash-1
which switches the hash function to something more modern (still looking for
supporting data that it actually helps).

But none of this will drastically cut the time, the profile is fairly flat.

# Overhead  Command  Source Shared Object  Source Symbol                                                                                   >
# ........  .......  ....................  ................................................................................................>
#
     5.11%  cc1plus  cc1plus               [.] push_to_top_level()                                                                         >
     2.71%  cc1plus  cc1plus               [.] gt_ggc_mx_lang_tree_node(void*)                                                             >
     1.00%  cc1plus  cc1plus               [.] ggc_set_mark(void const*)                                                                   >
     0.94%  cc1plus  cc1plus               [.] iterative_hash                                                                              >
     0.73%  cc1plus  cc1plus               [.] fields_linear_search(tree_node*, tree_node*, bool) [clone .isra.0]                          >
     0.72%  cc1plus  cc1plus               [.] iterative_hash_template_arg(tree_node*, unsigned int)                                       >
     0.67%  cc1plus  cc1plus               [.] ggc_internal_alloc(unsigned long, void (*)(void*), unsigned long, unsigned long)            >
     0.64%  cc1plus  cc1plus               [.] gt_ggc_mx_lang_tree_node(void*)                                                             >
     0.54%  cc1plus  cc1plus               [.] ggc_set_mark(void const*)                                                                   >
     0.54%  cc1plus  cc1plus               [.] fields_linear_search(tree_node*, tree_node*, bool) [clone .isra.0]                          >
     0.51%  cc1plus  cc1plus               [.] fields_linear_search(tree_node*, tree_node*, bool) [clone .isra.0]                          >
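
For anyone who wants to try the GC suggestion from the comment, a command line might look roughly like the sketch below. The two --param names are the ones mentioned above; the values (200 and 1048576) and the file name test.cc are illustrative assumptions, not numbers from this bug report. ggc-min-expand is a percentage of heap growth allowed between collections and ggc-min-heapsize is given in kilobytes, so larger values should make the collector run less often at the cost of more memory.

```shell
# Illustrative values only; they are not taken from the bug report.
GGC_MIN_EXPAND=200          # percent the heap may grow between collections
GGC_MIN_HEAPSIZE=1048576    # kilobytes (1 GiB) of heap before collecting at all

GGC_FLAGS="--param ggc-min-expand=${GGC_MIN_EXPAND} --param ggc-min-heapsize=${GGC_MIN_HEAPSIZE}"

# test.cc is a placeholder for the slow translation unit.
# The command is only printed here; remove the echo to actually run it.
echo "time g++ ${GGC_FLAGS} -c test.cc"
```

Comparing the timing with and without these flags, on the same input, would show whether the GC share of the profile really shrinks.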