https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96388
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at gcc dot gnu.org

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #10)
> The partially reduced (In reply to Martin Liška from comment #9)
> > Created attachment 48962 [details]
> > Partially reduced test-case
> >
> > The reduction is quite stuck at this point.
>
> No longer keys on -fPIC though, so the bisection for this is likely wrong.
> -fno-schedule-insns2 improves it from 18s to 5s compile time and from
> 1.1GB of peak RSS to 320MB.
>
>  scheduling 2 :  12.69 ( 71%)   0.10 ( 67%)  12.79 ( 70%)   11128 kB ( 16%)
>
> -fmem-report doesn't show anything interesting, looking for heap allocations
> now to find the offender.
>
> Can you bisect your reduced testcase again?  GCC 8.4 behaves the same for it
> rather than being good but GCC 4.8.5 is fine.

For the testcase most time is spent in constrain_operands and
update_conflict_hard_regno_costs.  It looks like the main issue is
a very large chain of dependences and thus going from 27000 schedule_insn
calls to 10 000 000 calls to try_ready which means the sd_iterator
iterates over many dependent instructions, not stopping at
"common dependences".  That's likely also the source of the memory use
(the dn_pool), though memory reporting with --enable-gather-detailed-mem-stats
doesn't seem to work for this pool?

dep_node   sched-deps.c:4107 (sched_deps_init)  1  0 :  0.0%      0     0 :  0.0%  80
deps_list  sched-deps.c:4105 (sched_deps_init)  1  0 :  0.0%  2179k  136k:  0.9%  16

There's also 10 million dep_replacement nodes which are all allocated
via XCNEW ... another object_allocator would be more efficient here I guess.

Could it be that sched-deps makes a tree out of a dependence graph?

CCing the only active haifa scheduler maintainer...
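
To illustrate the XCNEW vs. object_allocator remark, here is a minimal sketch of a fixed-size pool. It is not GCC's object_allocator and does not use its API; simple_pool, small_node and all member names are hypothetical and exist only for this example. It assumes trivial element types with no stricter alignment than malloc provides. XCNEW makes one zeroing heap call per node, whereas a pool of this kind makes one heap call per block of elements and recycles freed elements through a free list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a small fixed-size node (not dep_replacement).
struct small_node
{
  void *loc;
  void *orig;
  void *repl;
};

// Minimal fixed-size pool: carves elements out of large heap blocks and
// recycles released elements through an intrusive free list.
template <typename T, std::size_t BlockElts = 1024>
class simple_pool
{
  static_assert (std::is_trivial<T>::value, "sketch handles trivial types only");

  union slot
  {
    slot *next;                                  // used while the slot is free
    alignas (T) unsigned char obj[sizeof (T)];   // used while it is handed out
  };

  std::vector<slot *> m_blocks;   // every block obtained from the heap
  slot *m_free = nullptr;         // head of the free list
  std::size_t m_live = 0;         // elements currently handed out

  // Get one more block from the heap and thread its slots onto the free list.
  void grow ()
  {
    slot *b = static_cast<slot *> (std::malloc (BlockElts * sizeof (slot)));
    if (!b)
      std::abort ();
    m_blocks.push_back (b);
    for (std::size_t i = 0; i < BlockElts; ++i)
      {
        b[i].next = m_free;
        m_free = &b[i];
      }
  }

public:
  simple_pool () = default;
  simple_pool (const simple_pool &) = delete;
  simple_pool &operator= (const simple_pool &) = delete;

  ~simple_pool ()
  {
    for (slot *b : m_blocks)
      std::free (b);
  }

  // Return zero-filled storage for one T, mirroring what XCNEW hands back.
  T *allocate ()
  {
    if (!m_free)
      grow ();
    slot *s = m_free;
    m_free = s->next;
    std::memset (s, 0, sizeof (slot));
    ++m_live;
    return reinterpret_cast<T *> (s->obj);
  }

  // Put an element back on the free list; no heap call is made.
  void release (T *p)
  {
    assert (m_live > 0);
    slot *s = reinterpret_cast<slot *> (p);
    s->next = m_free;
    m_free = s;
    --m_live;
  }

  std::size_t live () const { return m_live; }
  std::size_t heap_calls () const { return m_blocks.size (); }
};
```

With the default block size of 1024 elements, ten million nodes would need roughly ten thousand heap calls in this sketch instead of ten million, and releasing a node never touches the heap. Whether that matters for the scheduler depends on how much of the time and memory actually goes into those nodes, which the numbers above do not show.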