This patch avoids sched-deps.cc:find_inc() creating exponential number of dependencies, which become memory and compilation time hogs. Consider example (simplified from PR96388) ... === sp=sp-4 // sp_insnA mem_insnA1[sp+A1] ... mem_insnAN[sp+AN] sp=sp-4 // sp_insnB mem_insnB1[sp+B1] ... mem_insnBM[sp+BM] === ... in this example find_modifiable_mems() will arrange for mem_insnA* to be able to pass sp_insnA, and, while doing this, will create dependencies between all mem_insnA*s and sp_insnB -- because sp_insnB is a consumer of sp_insnA. After this sp_insnB will have N new backward dependencies. Then find_modifiable_mems() gets to mem_insnB*s and starts to create N new dependencies for _every_ mem_insnB*. This gets us N*M new dependencies.
In PR96833's testcase N and M are 10k-15k, which causes RAM usage of 30GB and compilation time of 30 minutes, with sched2 accounting for 95% of both metrics. After this patch the RAM usage is down to 1GB and compilation time is down to 3-4 minutes, with sched2 no longer standing out on -ftime-report or memory usage. gcc/ChangeLog: PR rtl-optimization/96388 PR rtl-optimization/111554 * sched-deps.cc (find_inc): Avoid exponential behavior. --- gcc/sched-deps.cc | 45 +++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 41 insertions(+), 4 deletions(-) diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc index c23218890f3..a286ff319e2 100644 --- a/gcc/sched-deps.cc +++ b/gcc/sched-deps.cc @@ -4779,24 +4779,59 @@ parse_add_or_inc (struct mem_inc_info *mii, rtx_insn *insn, bool before_mem) /* Once a suitable mem reference has been found and the corresponding data in MII has been filled in, this function is called to find a suitable add or inc insn involving the register we found in the memory - reference. */ + reference. + If successful, this function will create additional dependencies between + - mii->inc_insn's producers and mii->mem_insn as a consumer (if backwards) + - mii->inc_insn's consumers and mii->mem_insn as a producer (if !backwards). +*/ static bool find_inc (struct mem_inc_info *mii, bool backwards) { sd_iterator_def sd_it; dep_t dep; + sd_list_types_def mem_deps = backwards ? SD_LIST_HARD_BACK : SD_LIST_FORW; + int n_mem_deps = sd_lists_size (mii->mem_insn, mem_deps); - sd_it = sd_iterator_start (mii->mem_insn, - backwards ? SD_LIST_HARD_BACK : SD_LIST_FORW); + sd_it = sd_iterator_start (mii->mem_insn, mem_deps); while (sd_iterator_cond (&sd_it, &dep)) { dep_node_t node = DEP_LINK_NODE (*sd_it.linkp); rtx_insn *pro = DEP_PRO (dep); rtx_insn *con = DEP_CON (dep); - rtx_insn *inc_cand = backwards ? pro : con; + rtx_insn *inc_cand; + int n_inc_deps; + if (DEP_NONREG (dep) || DEP_MULTIPLE (dep)) goto next; + + if (backwards) + { + inc_cand = pro; + n_inc_deps = sd_lists_size (inc_cand, SD_LIST_BACK); + } + else + { + inc_cand = con; + n_inc_deps = sd_lists_size (inc_cand, SD_LIST_FORW); + } + + /* In the FOR_EACH_DEP loop below we will create additional n_inc_deps + for mem_insn. This by itself is not a problem, since each mem_insn + will have only a few inc_insns associated with it. However, if + we consider that a single inc_insn may have a lot of mem_insns, AND, + on top of that, a few other inc_insns associated with it -- + those _other inc_insns_ will get (n_mem_deps * number of MEM insns) + dependencies created for them. This may cause an exponential + growth of memory usage and scheduling time. + See PR96388 for details. + We [heuristically] use n_inc_deps as a proxy for the number of MEM + insns, and drop opportunities for breaking modifiable_mem dependencies + when dependency lists grow beyond reasonable size. */ + if (n_mem_deps * n_inc_deps + >= param_max_pending_list_length * param_max_pending_list_length) + goto next; + if (parse_add_or_inc (mii, inc_cand, backwards)) { struct dep_replacement *desc; @@ -4838,6 +4873,8 @@ find_inc (struct mem_inc_info *mii, bool backwards) desc->insn = mii->mem_insn; move_dep_link (DEP_NODE_BACK (node), INSN_HARD_BACK_DEPS (con), INSN_SPEC_BACK_DEPS (con)); + + gcc_assert (mii->inc_insn == inc_cand); if (backwards) { FOR_EACH_DEP (mii->inc_insn, SD_LIST_BACK, sd_it, dep) -- 2.34.1