[PATCH 1/4] sched1: hookize pressure scheduling spilling aggressiveness

Vineet Gupta Sun, 20 Oct 2024 12:40:44 -0700

Pressure-sensitive scheduling seems to prefer "wide" schedules with more
parallelism, which tends to produce more spills. This works better for
in-order cores [1][2].

The Excess Change Cost (ECC) of an insn, essentially a proxy for the
register pressure attributed to that insn, deliberately ignores negative
values (pressure reduction) by making them 0 (neutral). This is what
produces the preference for wide schedules described above.

However, this heuristic induces a sched1 spill frenzy on RISC-V, especially
on SPEC2017 507.cactuBSSN_r. If insn scheduling is disabled completely
(w/ -fno-schedule-insns), the total dynamic icount for this workload is
roughly halved, from ~2.5 trillion insns to ~1.3 trillion.

This patch allows for an opt-in target hook
TARGET_SCHED_PRESSURE_PREFER_NARROW:

 - The default hook (returns false) preserves existing behavior of wider
   schedules, more parallelism and potentially more spills.

 - Targets whose hook returns true get the reverse effect: narrower
   schedules, less parallelism and fewer spills.

The RISC-V backend implements this hook in the next patch.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659847.html
[2] https://gcc.gnu.org/legacy-ml/gcc-patches/2011-12/msg01684.html

gcc/ChangeLog:
        PR target/11472
        * target.def (pressure_prefer_narrow): Add target hook.
        * doc/tm.texi.in: Add TARGET_SCHED_PRESSURE_PREFER_NARROW.
        * doc/tm.texi: Regenerated.
        * haifa-sched.cc (model_excess_group_cost): Return negative
        delta if targetm.sched.pressure_prefer_narrow returns true.
        (model_excess_cost): Print the ECC value in the scheduler dump.
        (model_set_excess_costs): Clamp negative baseECC to 0 only if
        targetm.sched.pressure_prefer_narrow returns false.

Signed-off-by: Vineet Gupta <vine...@rivosinc.com>
---
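Note for readers (not part of the commit message): the clamping choice
described above can be illustrated with a minimal, self-contained sketch.
This is a toy model, not the haifa-sched.cc code: the function name
effective_ecc and its prefer_narrow parameter are hypothetical stand-ins
for the MAX (baseECC, 0) step and for
targetm.sched.pressure_prefer_narrow (), and the baseP term that the real
ECC calculation also uses is deliberately left out.

```c
#include <stdbool.h>

/* Toy model of the clamping decision only (hypothetical helper).
   base_ecc is an insn's raw baseECC; prefer_narrow stands in for the
   value returned by the target hook.  */
static int
effective_ecc (int base_ecc, bool prefer_narrow)
{
  /* Narrow preference: keep negative values, so insns that reduce
     register pressure stay distinguishable from neutral ones.  */
  if (prefer_narrow)
    return base_ecc;

  /* Default (wide) preference: MAX (baseECC, 0), i.e. a pressure
     reduction is treated the same as no change.  */
  return base_ecc > 0 ? base_ecc : 0;
}
```

Under these assumptions, a pressure-reducing insn with baseECC -3 is
seen as 0 with the default hook and as -3 when the hook returns true,
while positive values are unaffected either way.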
 gcc/doc/tm.texi    | 11 +++++++++++
 gcc/doc/tm.texi.in |  2 ++
 gcc/haifa-sched.cc | 30 ++++++++++++++++++++++--------
 gcc/target.def     | 13 +++++++++++++
 4 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 4deb3d2c283a..0f5255436c8b 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -7495,6 +7495,17 @@ This is the cleanup hook corresponding to @code{TARGET_SCHED_INIT_GLOBAL}.
 @var{verbose} is the verbose level provided by @option{-fsched-verbose-@var{n}}.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_SCHED_PRESSURE_PREFER_NARROW (void)
+This hook returns the target's boolean preference for narrow schedules (fewer
+spills) vs. wider schedules (potentially more spills) in pressure-sensitive
+instruction scheduling.  The algorithm is currently slightly biased towards
+in-order cores and thus favors wider schedules with more parallelism
+(and spills).  For certain targets, depending on the ISA (number of registers,
+addressing modes, etc.), the spilling can get excessive; this hook allows
+targets to override that.  The default version of this hook returns
+@code{false}, which preserves the existing spilling behavior.
+@end deftypefn
+
 @deftypefn {Target Hook} rtx TARGET_SCHED_DFA_PRE_CYCLE_INSN (void)
 The hook returns an RTL insn.  The automaton state used in the
 pipeline hazard recognizer is changed as if the insn were scheduled
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 9f147ccb95cc..0f24344a272b 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4786,6 +4786,8 @@ them: try the first ones in this list first.
 
 @hook TARGET_SCHED_FINISH_GLOBAL
 
+@hook TARGET_SCHED_PRESSURE_PREFER_NARROW
+
 @hook TARGET_SCHED_DFA_PRE_CYCLE_INSN
 
 @hook TARGET_SCHED_INIT_DFA_PRE_CYCLE_INSN
diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
index 1bc610f9a5f9..f8c42d30d5a4 100644
--- a/gcc/haifa-sched.cc
+++ b/gcc/haifa-sched.cc
@@ -2398,11 +2398,18 @@ model_excess_group_cost (struct model_pressure_group *group,
   int pressure, cl;
 
   cl = ira_pressure_classes[pci];
-  if (delta < 0 && point >= group->limits[pci].point)
+  if (delta < 0)
     {
-      pressure = MAX (group->limits[pci].orig_pressure,
-                     curr_reg_pressure[cl] + delta);
-      return -model_spill_cost (cl, pressure, curr_reg_pressure[cl]);
+      if (point >= group->limits[pci].point)
+       {
+         pressure = MAX (group->limits[pci].orig_pressure,
+                         curr_reg_pressure[cl] + delta);
+         return -model_spill_cost (cl, pressure, curr_reg_pressure[cl]);
+       }
+      /* If the target prefers fewer spills, return the negative delta,
+        which indicates pressure reduction.  */
+      else if (targetm.sched.pressure_prefer_narrow ())
+         return delta;
     }
 
   if (delta > 0)
@@ -2453,7 +2460,7 @@ model_excess_cost (rtx_insn *insn, bool print_p)
     }
 
   if (print_p)
-    fprintf (sched_dump, "\n");
+    fprintf (sched_dump, " ECC %d\n", cost);
 
   return cost;
 }
@@ -2489,8 +2496,9 @@ model_set_excess_costs (rtx_insn **insns, int count)
   bool print_p;
 
   /* Record the baseECC value for each instruction in the model schedule,
-     except that negative costs are converted to zero ones now rather than
-     later.  Do not assign a cost to debug instructions, since they must
+     except that for targets which prefer wider schedules (more spills)
+     negative costs are converted to zero ones now rather than later.
+     Do not assign a cost to debug instructions, since they must
      not change code-generation decisions.  Experiments suggest we also
      get better results by not assigning a cost to instructions from
      a different block.
@@ -2512,7 +2520,7 @@ model_set_excess_costs (rtx_insn **insns, int count)
            print_p = true;
          }
        cost = model_excess_cost (insns[i], print_p);
-       if (cost <= 0)
+       if (!targetm.sched.pressure_prefer_narrow () && cost <= 0)
          {
            priority = INSN_PRIORITY (insns[i]) - insn_delay (insns[i]) - cost;
            priority_base = MAX (priority_base, priority);
@@ -2523,6 +2531,12 @@ model_set_excess_costs (rtx_insn **insns, int count)
   if (print_p)
     fprintf (sched_dump, MODEL_BAR);
 
+  /* If the target prefers "narrow" schedules (fewer spills), skip the
+     MAX (baseECC, 0) step, which changes negative baseECC (pressure
+     reduction) to 0 (neutral) and thus favors spills.  */
+  if (targetm.sched.pressure_prefer_narrow ())
+    return;
+
   /* Use MAX (baseECC, 0) and baseP to calculcate ECC for each
      instruction.  */
   for (i = 0; i < count; i++)
diff --git a/gcc/target.def b/gcc/target.def
index b31550108883..19333591bbc5 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1038,6 +1038,19 @@ DEFHOOK
 @var{verbose} is the verbose level provided by @option{-fsched-verbose-@var{n}}.",
  void, (FILE *file, int verbose), NULL)
 
+DEFHOOK
+(pressure_prefer_narrow,
+ "This hook returns the target's boolean preference for narrow schedules (fewer\n\
+spills) vs. wider schedules (potentially more spills) in pressure-sensitive\n\
+instruction scheduling.  The algorithm is currently slightly biased towards\n\
+in-order cores and thus favors wider schedules with more parallelism\n\
+(and spills).  For certain targets, depending on the ISA (number of registers,\n\
+addressing modes, etc.), the spilling can get excessive; this hook allows\n\
+targets to override that.  The default version of this hook returns\n\
+@code{false}, which preserves the existing spilling behavior.",
+ bool, (void),
+ hook_bool_void_false)
+
 /* Reorder insns in a machine-dependent fashion, in two different
        places.  Default does nothing.  */
 DEFHOOK
-- 
2.43.0
