https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854

--- Comment #150 from Richard Biener <rguenth at gcc dot gnu.org> ---
For _num.i at -O2+ it's PRE / postreload GCSE via compute_transp that takes all
the compile time.  The reason is all the sbitmaps used, and using them
"inverted", aka one bitmap per BB instead of one bitmap per expr.
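
To illustrate the two layouts, here is a minimal toy sketch.  It is not GCC's
code: BitMatrix stands in for a vector of sbitmaps, and kill_expr_per_bb /
kill_expr_per_expr are hypothetical names.  Both functions record the same
facts; the difference is that clearing one expression in the per-BB layout
touches one bit in each of many bitmaps, while in the per-expr layout the
update stays inside a single bitmap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy dense bit matrix; a stand-in for a vector of sbitmaps, not GCC's type.
struct BitMatrix {
    std::size_t rows, cols;
    std::vector<bool> bits;

    BitMatrix(std::size_t r, std::size_t c, bool init)
        : rows(r), cols(c), bits(r * c, init) {}

    void clear(std::size_t r, std::size_t c) {
        assert(r < rows && c < cols);
        bits[r * cols + c] = false;
    }

    bool test(std::size_t r, std::size_t c) const {
        assert(r < rows && c < cols);
        return bits[r * cols + c];
    }
};

// Per-BB layout: row = basic block, column = expression.
// Killing one expression in many blocks touches one bit in each of many rows.
void kill_expr_per_bb(BitMatrix &transp, std::size_t expr,
                      const std::vector<std::size_t> &killing_bbs) {
    for (std::size_t bb : killing_bbs)
        transp.clear(bb, expr);
}

// Per-expr layout: row = expression, column = basic block.
// The same update stays inside a single row.
void kill_expr_per_expr(BitMatrix &transp, std::size_t expr,
                        const std::vector<std::size_t> &killing_bbs) {
    for (std::size_t bb : killing_bbs)
        transp.clear(expr, bb);
}
```

Whether the layout matters in practice depends on the access pattern of the
consumer, so this only shows the shape of the data, not a measured speedup.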

Sorting the expressions by bitmap index before processing doesn't help here.

Samples: 116K of event 'cycles', Event count (approx.): 132865298352
Overhead  Samples  Command  Shared Object  Symbol
   7.64%     8910  cc1      cc1            [.] find_base_term
   6.45%     7521  cc1      cc1            [.] get_ref_base_and_extent
   6.35%     7406  cc1      cc1            [.] compute_transp
   2.85%     3308  cc1      cc1            [.] bitmap_bit_p
   2.84%     3314  cc1      cc1            [.] rtx_equal_for_memref_p
   2.68%     3124  cc1      cc1            [.] find_base_term

It's also mostly alias analysis cost, so maybe the bitmaps are not the actual
problem, but rather that we compute transparency for each block and each
expression, even for blocks that will in the end not require it because the
expr isn't antic through them.
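
A rough sketch of that idea, under loud assumptions: the `needed` mask (which
block/expr pairs could matter at all) is simply taken as given, and how to
obtain it cheaply is left open.  `kills` is a hypothetical stand-in for the
expensive alias queries.  None of these names exist in GCC.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Matrix = std::vector<std::vector<bool>>;  // indexed [block][expr]
using KillQuery = std::function<bool(std::size_t, std::size_t)>;

// Compute transparency only for the pairs selected by `needed`.
// Unselected pairs keep their old value and must be read as "not computed".
// Returns the number of kill queries issued.
std::size_t compute_transp_filtered(const Matrix &needed,
                                    const KillQuery &kills,
                                    Matrix &transp) {
    assert(needed.size() == transp.size());
    std::size_t queries = 0;
    for (std::size_t bb = 0; bb < needed.size(); ++bb) {
        assert(needed[bb].size() == transp[bb].size());
        for (std::size_t e = 0; e < needed[bb].size(); ++e) {
            if (!needed[bb][e])
                continue;  // skip the expensive query entirely
            ++queries;
            transp[bb][e] = !kills(bb, e);
        }
    }
    return queries;
}
```

The returned count makes it easy to compare against the eager variant, which
would issue one query per block/expr pair.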
