https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114855

--- Comment #50 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #4)
> Trunk at -O1:
> 
> dominator optimization             : 495.14 ( 82%)   0.20 (  5%) 495.44 ( 81%)   113M (  5%)

Compared to that we're now at the following state with -O1 (everything >= 4%):

 callgraph ipa passes               :  17.23 ( 10%)
 df live regs                       :   6.76 (  4%)
 dominator optimization             :  89.76 ( 50%)
 backwards jump threading           :   7.94 (  4%)
 TOTAL                              : 180.77

So it's still DOM aka forward threading eating most of the time. 
-fno-thread-jumps improves compile-time to 77s, DOM then still takes 25s (33%)
(top offenders are then dom_oracle::register_transitives, bitmap_set_bit
and wide_int_storage copying).  I noticed the unbounded dominator traversal
in register_transitives already.
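
To illustrate what bounding such a traversal could look like, here is a
minimal sketch.  It is not the actual dom_oracle code: DomTree,
walk_dominators_bounded and max_depth are hypothetical names introduced only
for this example.  The walk up the immediate-dominator chain is capped at a
fixed number of ancestors, so the cost per query no longer grows with the
depth of the dominator tree:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dominator tree (not GCC's data structure):
// idom[b] is the immediate dominator of block b, and the entry block is
// its own immediate dominator.
struct DomTree {
    std::vector<int> idom;
};

// Walk up the dominator chain from `bb`, calling `visit` on each ancestor,
// but stop after at most `max_depth` ancestors (assumed tuning knob).
// Returns the number of ancestors actually visited.
template <typename Visit>
std::size_t walk_dominators_bounded(const DomTree &dt, int bb,
                                    std::size_t max_depth, Visit visit) {
    std::size_t visited = 0;
    while (visited < max_depth && dt.idom[bb] != bb) {
        bb = dt.idom[bb];   // step to the immediate dominator
        visit(bb);
        ++visited;
    }
    return visited;
}
```

The trade-off of such a cap is that relations recorded in blocks beyond the
limit are simply not seen, so it exchanges some precision for a bounded cost.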

With -O2 we're still running into the backwards threader slowness.  I don't
see a quick way to fix that without also eventually changing what is threaded
and what is not as a side effect of changing thread materialization order.  So I
think a bigger refactoring like Aldy started is necessary.  Eventually I'll
re-investigate a "quick" fix, but at least being able to record additional
metadata per thread path is necessary (so 0001 of Aldy's proposed series in its
current or in slightly altered form).
