https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863

--- Comment #45 from Richard Biener <rguenth at gcc dot gnu.org> ---
perf profile from non-bootstrapped, release checking enabled lto1 for ltrans34
on a 3900X (so plenty of L3):

Samples: 1M of event 'cycles:u', Event count (approx.): 1572289976832
Overhead  Samples  Command      Shared Object  Symbol
   7.10%    96925  lto1-ltrans  lto1           [.] bitmap_and_into
   6.37%    87172  lto1-ltrans  lto1           [.] bitmap_list_insert_element_after
   5.84%    80125  lto1-ltrans  lto1           [.] bitmap_set_bit
   5.71%    78151  lto1-ltrans  lto1           [.] bitmap_ior_into
   5.70%    78041  lto1-ltrans  lto1           [.] bitmap_bit_p
   3.71%    50632  lto1-ltrans  lto1           [.] bitmap_and
   3.48%    47914  lto1-ltrans  lto1           [.] df_count_refs
   2.87%    39504  lto1-ltrans  lto1           [.] lra_create_live_ranges_1
   2.45%    33656  lto1-ltrans  lto1           [.] bitmap_elt_ior
   2.39%    32794  lto1-ltrans  lto1           [.] pre_and_rev_post_order_compute_fn
   2.34%    32200  lto1-ltrans  lto1           [.] update_pseudo_point
   2.03%    27707  lto1-ltrans  lto1           [.] bitmap_ior_and_compl
   1.79%    24445  lto1-ltrans  lto1           [.] bitmap_and_compl_into
   1.52%    20804  lto1-ltrans  lto1           [.] df_worklist_dataflow
   1.45%    19838  lto1-ltrans  lto1           [.] bitmap_copy
   1.35%    18690  lto1-ltrans  lto1           [.] get_immediate_dominator
   1.32%    18127  lto1-ltrans  lto1           [.] update_ssa
   1.26%    17562  lto1-ltrans  lto1           [.] determine_value_range
   1.09%    14873  lto1-ltrans  lto1           [.] rewrite_update_dom_walker::before_dom_children
   1.08%    14777  lto1-ltrans  lto1           [.] compute_idf
   1.05%    14349  lto1-ltrans  lto1           [.] compute_dominance_frontiers
   0.94%    12956  lto1-ltrans  libc-2.26.so   [.] __memset_avx2_unaligned_erms

At some point I wondered why keeping DF_LIVE around pays off, but we don't
have an easy knob to disable it at -O2.  Not many passes need LIVE; most
make do with LR.
