Pressure-sensitive scheduling seems to prefer "wide" schedules with more
parallelism, which tends to cause more spills. This works better for
in-order cores [1][2].
The Excess Change Cost (ECC) of an insn, essentially a proxy for the register
pressure attributed to that insn, deliberately ignores negative values
(pressure reduction) by clamping them to 0 (neutral), which leads to the
above heuristic.

However, this heuristic induces a sched1 spill frenzy on RISC-V, especially
on SPEC2017 507.Cactu. If insn scheduling is disabled completely
(w/ -fno-schedule-insns), the total dynamic icount for this workload is
roughly halved, from ~2.5 trillion insns to ~1.3 trillion.

This patch adds an opt-in target hook TARGET_SCHED_PRESSURE_PREFER_NARROW:
 - The default hook (returns false) preserves the existing behavior of wider
   schedules, more parallelism and potentially more spills.
 - Targets implementing the hook to return true get the reverse effect.

The RISC-V backend implements this hook in the next patch.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659847.html
[2] https://gcc.gnu.org/legacy-ml/gcc-patches/2011-12/msg01684.html

gcc/ChangeLog:

	PR target/11472
	* target.def (pressure_prefer_narrow): Add target hook.
	* doc/tm.texi.in: Add TARGET_SCHED_PRESSURE_PREFER_NARROW.
	* doc/tm.texi: Regenerated.
	* haifa-sched.cc (model_excess_group_cost): Return negative delta
	if targetm.sched.pressure_prefer_narrow returns true.
	(model_excess_cost): Print ECC in the scheduler dump.
	(model_set_excess_costs): Clamp negative baseECC to 0 only if
	targetm.sched.pressure_prefer_narrow returns false.

Signed-off-by: Vineet Gupta <vine...@rivosinc.com>
---
 gcc/doc/tm.texi    | 11 +++++++++++
 gcc/doc/tm.texi.in |  2 ++
 gcc/haifa-sched.cc | 30 ++++++++++++++++++++++--------
 gcc/target.def     | 13 +++++++++++++
 4 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 4deb3d2c283a..0f5255436c8b 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -7495,6 +7495,17 @@ This is the cleanup hook corresponding to @code{TARGET_SCHED_INIT_GLOBAL}.
 @var{verbose} is the verbose level provided by @option{-fsched-verbose-@var{n}}.
 @end deftypefn

+@deftypefn {Target Hook} bool TARGET_SCHED_PRESSURE_PREFER_NARROW (void)
+This hook returns the target's preference for narrow schedules (fewer
+spills) vs. wider schedules (potentially more spills) in pressure-sensitive
+instruction scheduling. The algorithm is currently slightly biased towards
+in-order cores and thus favors wider schedules with more parallelism
+(and spills). For certain targets, depending on ISA (number of registers,
+addressing modes, etc.) spilling can get excessive, which this hook allows
+overriding. The default version of this hook returns @code{false}, which
+preserves the existing spilling behavior.
+@end deftypefn
+
 @deftypefn {Target Hook} rtx TARGET_SCHED_DFA_PRE_CYCLE_INSN (void)
 The hook returns an RTL insn. The automaton state used in the
 pipeline hazard recognizer is changed as if the insn were scheduled
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 9f147ccb95cc..0f24344a272b 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4786,6 +4786,8 @@ them: try the first ones in this list first.
 @hook TARGET_SCHED_FINISH_GLOBAL

+@hook TARGET_SCHED_PRESSURE_PREFER_NARROW
+
 @hook TARGET_SCHED_DFA_PRE_CYCLE_INSN

 @hook TARGET_SCHED_INIT_DFA_PRE_CYCLE_INSN
diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
index 1bc610f9a5f9..f8c42d30d5a4 100644
--- a/gcc/haifa-sched.cc
+++ b/gcc/haifa-sched.cc
@@ -2398,11 +2398,18 @@ model_excess_group_cost (struct model_pressure_group *group,
   int pressure, cl;

   cl = ira_pressure_classes[pci];
-  if (delta < 0 && point >= group->limits[pci].point)
+  if (delta < 0)
     {
-      pressure = MAX (group->limits[pci].orig_pressure,
-                      curr_reg_pressure[cl] + delta);
-      return -model_spill_cost (cl, pressure, curr_reg_pressure[cl]);
+      if (point >= group->limits[pci].point)
+        {
+          pressure = MAX (group->limits[pci].orig_pressure,
+                          curr_reg_pressure[cl] + delta);
+          return -model_spill_cost (cl, pressure, curr_reg_pressure[cl]);
+        }
+      /* if target prefers fewer spills, return the -ve delta indicating
+         pressure reduction. */
+      else if (targetm.sched.pressure_prefer_narrow ())
+        return delta;
     }

   if (delta > 0)
@@ -2453,7 +2460,7 @@ model_excess_cost (rtx_insn *insn, bool print_p)
     }

   if (print_p)
-    fprintf (sched_dump, "\n");
+    fprintf (sched_dump, " ECC %d\n", cost);

   return cost;
 }
@@ -2489,8 +2496,9 @@ model_set_excess_costs (rtx_insn **insns, int count)
   bool print_p;

   /* Record the baseECC value for each instruction in the model schedule,
-     except that negative costs are converted to zero ones now rather than
-     later. Do not assign a cost to debug instructions, since they must
+     except that for targets which prefer wider schedules (more spills)
+     negative costs are converted to zero ones now rather than later.
+     Do not assign a cost to debug instructions, since they must
      not change code-generation decisions. Experiments suggest we also get
      better results by not assigning a cost to instructions from a different
      block.
@@ -2512,7 +2520,7 @@ model_set_excess_costs (rtx_insn **insns, int count)
             print_p = true;
           }
         cost = model_excess_cost (insns[i], print_p);
-        if (cost <= 0)
+        if (!targetm.sched.pressure_prefer_narrow () && cost <= 0)
           {
             priority = INSN_PRIORITY (insns[i]) - insn_delay (insns[i]) - cost;
             priority_base = MAX (priority_base, priority);
@@ -2523,6 +2531,12 @@ model_set_excess_costs (rtx_insn **insns, int count)
   if (print_p)
     fprintf (sched_dump, MODEL_BAR);

+  /* If the target prefers "narrow" schedules (fewer spills), avoid
+     MAX (baseECC, 0), which changes negative baseECC (pressure reduction)
+     to 0 (neutral), thus favoring spills. */
+  if (targetm.sched.pressure_prefer_narrow ())
+    return;
+
   /* Use MAX (baseECC, 0) and baseP to calculcate ECC for each
      instruction. */
   for (i = 0; i < count; i++)
diff --git a/gcc/target.def b/gcc/target.def
index b31550108883..19333591bbc5 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1038,6 +1038,19 @@ DEFHOOK
 @var{verbose} is the verbose level provided by @option{-fsched-verbose-@var{n}}.",
  void, (FILE *file, int verbose), NULL)

+DEFHOOK
+(pressure_prefer_narrow,
+ "This hook returns the target's preference for narrow schedules (fewer\n\
+spills) vs. wider schedules (potentially more spills) in pressure-sensitive\n\
+instruction scheduling. The algorithm is currently slightly biased towards\n\
+in-order cores and thus favors wider schedules with more parallelism\n\
+(and spills). For certain targets, depending on ISA (number of registers,\n\
+addressing modes, etc.) spilling can get excessive, which this hook allows\n\
+overriding. The default version of this hook returns @code{false}, which\n\
+preserves the existing spilling behavior.",
+ bool, (void),
+ hook_bool_void_false)
+
 /* Reorder insns in a machine-dependent fashion, in two different
    places. Default does nothing. */
 DEFHOOK
-- 
2.43.0
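
For illustration only, here is a standalone toy model of the effect the hook
has on the baseECC clamp. It is not the haifa-sched.cc code: the names
prefer_narrow_hook, hook_default_false, hook_prefer_narrow, effective_ecc and
pick_cheapest are all hypothetical, and the real scheduler combines ECC with
priorities and delays rather than simply picking the lowest cost.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for targetm.sched.pressure_prefer_narrow.  */
typedef bool (*prefer_narrow_hook) (void);

/* Models the default hook: keep the existing "wide" behavior.  */
static bool
hook_default_false (void)
{
  return false;
}

/* Models a target that opts in to "narrow" schedules.  */
static bool
hook_prefer_narrow (void)
{
  return true;
}

/* Toy version of the clamp: by default a negative baseECC (pressure
   reduction) is raised to 0 (neutral); a target that prefers narrow
   schedules keeps the negative value.  */
static int
effective_ecc (int base_ecc, prefer_narrow_hook prefer_narrow)
{
  if (!prefer_narrow () && base_ecc < 0)
    return 0;
  return base_ecc;
}

/* Return the index of the candidate with the lowest effective cost.
   Ties go to the earliest candidate.  This is a simplification of how
   the scheduler ranks insns.  */
static int
pick_cheapest (const int *base_ecc, int count,
               prefer_narrow_hook prefer_narrow)
{
  assert (count > 0);
  int best = 0;
  for (int i = 1; i < count; i++)
    if (effective_ecc (base_ecc[i], prefer_narrow)
        < effective_ecc (base_ecc[best], prefer_narrow))
      best = i;
  return best;
}
```

With candidate costs { 0, -2, 3 }, the default hook makes the first two
candidates look identical (both 0), so the pressure-reducing insn gets no
preference and index 0 is picked. With the narrow preference the -2 survives
and index 1 is picked.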