https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89115

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
The DSE thing is (of course) alias queries and there, find_base_term.
200 000 calls to check_mem_read_use result in 25 600 000 calls to
canon_true_dependence.  I suppose we could cache the result of
find_base_term and have a canon_true_dependence_with_bases.  Eventually
DSE should just give up with too long next_local_store chains.

Btw, the plus_constant calls all originate from true_dependence_1 ending
up calling get_addr and that re-building RTL that must be there already
somehow.  Thus in the end it originates from the excessive number of alias
queries done by DSE.  Even that get_addr() part could be cached though.
canon_true_dependence_with_bases_and_addrs.

Unfortunately --param max-dse-active-local-stores is a bail-out thing so
we need to cross a magic barrier which is somewhere between 1000 and 1250:

> /usr/bin/time /abuild/rguenther/obj/gcc/cc1 -quiet tagCircle49h12.i -O
> --param max-dse-active-local-stores=5000 (default)
18.59user 1.03system 0:19.62elapsed 99%CPU (0avgtext+0avgdata
3787968maxresident)k
0inputs+9808outputs (0major+933497minor)pagefaults 0swaps

> /usr/bin/time /abuild/rguenther/obj/gcc/cc1 -quiet tagCircle49h12.i -O
> --param max-dse-active-local-stores=2500
18.13user 1.09system 0:19.22elapsed 99%CPU (0avgtext+0avgdata
3787792maxresident)k
0inputs+9904outputs (0major+934009minor)pagefaults 0swaps

> /usr/bin/time /abuild/rguenther/obj/gcc/cc1 -quiet tagCircle49h12.i -O
> --param max-dse-active-local-stores=2000
18.57user 0.98system 0:19.56elapsed 99%CPU (0avgtext+0avgdata
3786852maxresident)k
0inputs+9808outputs (0major+933789minor)pagefaults 0swaps

> /usr/bin/time /abuild/rguenther/obj/gcc/cc1 -quiet tagCircle49h12.i -O
> --param max-dse-active-local-stores=1500
18.71user 1.01system 0:19.74elapsed 99%CPU (0avgtext+0avgdata
3789372maxresident)k
0inputs+9808outputs (0major+933920minor)pagefaults 0swaps

> /usr/bin/time /abuild/rguenther/obj/gcc/cc1 -quiet tagCircle49h12.i -O
> --param max-dse-active-local-stores=1250
18.54user 0.94system 0:19.49elapsed 99%CPU (0avgtext+0avgdata
3788452maxresident)k
0inputs+9808outputs (0major+933435minor)pagefaults 0swaps

> /usr/bin/time /abuild/rguenther/obj/gcc/cc1 -quiet tagCircle49h12.i -O
> --param max-dse-active-local-stores=1000
7.63user 0.22system 0:07.86elapsed 99%CPU (0avgtext+0avgdata
715704maxresident)k
0inputs+9808outputs (0major+170563minor)pagefaults 0swaps

> /usr/bin/time /abuild/rguenther/obj/gcc/cc1 -quiet tagCircle49h12.i -O
> --param max-dse-active-local-stores=500
7.66user 0.24system 0:07.90elapsed 100%CPU (0avgtext+0avgdata
717116maxresident)k

> /usr/bin/time /abuild/rguenther/obj/gcc/cc1 -quiet tagCircle49h12.i -O
> --param max-dse-active-local-stores=250
7.73user 0.16system 0:07.90elapsed 100%CPU (0avgtext+0avgdata
715960maxresident)k
0inputs+9904outputs (0major+170918minor)pagefaults 0swaps

I am testing

Index: gcc/opts.c
===================================================================
--- gcc/opts.c  (revision 268383)
+++ gcc/opts.c  (working copy)
@@ -670,7 +670,16 @@ default_options_optimization (struct gcc
   /* For -O1 only do loop invariant motion for very small loops. */
   maybe_set_param_value
     (PARAM_LOOP_INVARIANT_MAX_BBS_IN_LOOP,
-     opt2 ? default_param_value (PARAM_LOOP_INVARIANT_MAX_BBS_IN_LOOP) : 1000,
+     opt2 ? default_param_value (PARAM_LOOP_INVARIANT_MAX_BBS_IN_LOOP)
+     : default_param_value (PARAM_LOOP_INVARIANT_MAX_BBS_IN_LOOP) / 10,
+     opts->x_param_values, opts_set->x_param_values);
+
+  /* For -O1 reduce the maximum number of active local stores for RTL DSE
+     since this can consume huge amounts of memory (PR89115). */
+  maybe_set_param_value
+    (PARAM_MAX_DSE_ACTIVE_LOCAL_STORES,
+     opt2 ? default_param_value (PARAM_MAX_DSE_ACTIVE_LOCAL_STORES)
+     : default_param_value (PARAM_MAX_DSE_ACTIVE_LOCAL_STORES) / 10,
      opts->x_param_values, opts_set->x_param_values);
 
   /* At -Ofast, allow store motion to introduce potential race conditions. */
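
To make the effect of the "/ 10" concrete, here is a minimal sketch of the selection the patch performs for max-dse-active-local-stores. It assumes the default of 5000 quoted in the timings above. The enum and function name are made up for illustration and are not GCC API; the real code goes through default_param_value () and maybe_set_param_value ().

```c
/* Hypothetical stand-in for default_param_value
   (PARAM_MAX_DSE_ACTIVE_LOCAL_STORES); 5000 is the default quoted in
   the timing runs.  */
enum { ASSUMED_DEFAULT_MAX_DSE_ACTIVE_LOCAL_STORES = 5000 };

/* Mirrors the conditional in the patch: keep the default when opt2 is
   set (-O2 and above), otherwise use a tenth of it.  */
static int
effective_max_dse_active_local_stores (int opt2)
{
  return opt2 ? ASSUMED_DEFAULT_MAX_DSE_ACTIVE_LOCAL_STORES
              : ASSUMED_DEFAULT_MAX_DSE_ACTIVE_LOCAL_STORES / 10;
}
```

Under that assumption the function returns 500 when opt2 is 0, which is below the barrier seen between 1000 and 1250 in the runs above, and 5000 when opt2 is nonzero.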