On Mon, Feb 3, 2025 at 6:29 PM Richard Sandiford <richard.sandif...@arm.com> wrote: > > Richard Biener <richard.guent...@gmail.com> writes: > > On Mon, Feb 3, 2025 at 7:23 AM H.J. Lu <hjl.to...@gmail.com> wrote: > >> > >> commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b > >> Author: Surya Kumari Jangala <jskum...@linux.ibm.com> > >> Date: Tue Jun 25 08:37:49 2024 -0500 > >> > >> ira: Scale save/restore costs of callee save registers with block > >> frequency > >> > >> scales the cost of saving/restoring a callee-save hard register in epilogue > >> and prologue with the entry block frequency, which, if not optimizing for > >> size, is 10000, for all targets. As the result, callee-saved registers > >> may not be used to preserve local variable values across calls on some > >> targets, like x86. Add a target hook for the callee-saved register cost > >> scale in epilogue and prologue used by IRA. The default version of this > >> target hook returns 1 if optimizing for size, otherwise returns the entry > >> block frequency. Add an x86 version of this target hook to restore the > >> old behavior prior to the above commit. > >> > >> PR rtl-optimization/111673 > >> PR rtl-optimization/115932 > >> PR rtl-optimization/116028 > >> PR rtl-optimization/117081 > >> PR rtl-optimization/117082 > >> PR rtl-optimization/118497 > >> * ira-color.cc (assign_hard_reg): Call the target hook for the > >> callee-saved register cost scale in epilogue and prologue. > >> * target.def (ira_callee_saved_register_cost_scale): New target > >> hook. > >> * targhooks.cc (default_ira_callee_saved_register_cost_scale): > >> New. > >> * targhooks.h (default_ira_callee_saved_register_cost_scale): > >> Likewise. > >> * config/i386/i386.cc (ix86_ira_callee_saved_register_cost_scale): > >> New. > >> (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Likewise. > >> * doc/tm.texi: Regenerated. > >> * doc/tm.texi.in (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): > >> New. > >> > >> Signed-off-by: H.J. Lu <hjl.to...@gmail.com> > >> --- > >> gcc/config/i386/i386.cc | 11 +++++++++++ > >> gcc/doc/tm.texi | 8 ++++++++ > >> gcc/doc/tm.texi.in | 2 ++ > >> gcc/ira-color.cc | 3 +-- > >> gcc/target.def | 12 ++++++++++++ > >> gcc/targhooks.cc | 8 ++++++++ > >> gcc/targhooks.h | 1 + > >> 7 files changed, 43 insertions(+), 2 deletions(-) > >> > >> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > >> index f89201684a8..3128973ba79 100644 > >> --- a/gcc/config/i386/i386.cc > >> +++ b/gcc/config/i386/i386.cc > >> @@ -20600,6 +20600,14 @@ ix86_class_likely_spilled_p (reg_class_t rclass) > >> return false; > >> } > >> > >> +/* Implement TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE. */ > >> + > >> +static int > >> +ix86_ira_callee_saved_register_cost_scale (int) > >> +{ > >> + return 1; > >> +} > >> + > >> /* Return true if a set of DST by the expression SRC should be allowed. > >> This prevents complex sets of likely_spilled hard regs before split1. > >> */ > >> > >> @@ -27078,6 +27086,9 @@ ix86_libgcc_floating_mode_supported_p > >> #define TARGET_PREFERRED_OUTPUT_RELOAD_CLASS > >> ix86_preferred_output_reload_class > >> #undef TARGET_CLASS_LIKELY_SPILLED_P > >> #define TARGET_CLASS_LIKELY_SPILLED_P ix86_class_likely_spilled_p > >> +#undef TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE > >> +#define TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE \ > >> + ix86_ira_callee_saved_register_cost_scale > >> > >> #undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST > >> #define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \ > >> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > >> index 0de24eda6f0..9f42913a4ef 100644 > >> --- a/gcc/doc/tm.texi > >> +++ b/gcc/doc/tm.texi > >> @@ -3047,6 +3047,14 @@ A target hook which can change allocno class for > >> given pseudo from > >> The default version of this target hook always returns given class. > >> @end deftypefn > >> > >> +@deftypefn {Target Hook} int TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE > >> (int @var{hard_regno}) > >> +A target hook which returns the callee-saved register @var{hard_regno} > >> +cost scale in epilogue and prologue used by IRA. > >> + > >> +The default version of this target hook returns 1 if optimizing for > >> +size, otherwise returns the entry block frequency. > >> +@end deftypefn > >> + > >> @deftypefn {Target Hook} bool TARGET_LRA_P (void) > >> A target hook which returns true if we use LRA instead of reload pass. > >> > >> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in > >> index 631d04131e3..6dbe22581ca 100644 > >> --- a/gcc/doc/tm.texi.in > >> +++ b/gcc/doc/tm.texi.in > >> @@ -2388,6 +2388,8 @@ in the reload pass. > >> > >> @hook TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS > >> > >> +@hook TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE > >> + > >> @hook TARGET_LRA_P > >> > >> @hook TARGET_REGISTER_PRIORITY > >> diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc > >> index 0699b349a1a..233060e1587 100644 > >> --- a/gcc/ira-color.cc > >> +++ b/gcc/ira-color.cc > >> @@ -2180,8 +2180,7 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) > >> + ira_memory_move_cost[mode][rclass][1]) > >> * saved_nregs / hard_regno_nregs (hard_regno, > >> mode) - 1) > >> - * (optimize_size ? 1 : > >> - REG_FREQ_FROM_BB (ENTRY_BLOCK_PTR_FOR_FN > >> (cfun))); > >> + * targetm.ira_callee_saved_register_cost_scale > >> (hard_regno); > >> cost += add_cost; > >> full_cost += add_cost; > >> } > >> diff --git a/gcc/target.def b/gcc/target.def > >> index 4050b2ebdd4..c348b15815a 100644 > >> --- a/gcc/target.def > >> +++ b/gcc/target.def > >> @@ -5714,6 +5714,18 @@ DEFHOOK > >> reg_class_t, (int, reg_class_t, reg_class_t), > >> default_ira_change_pseudo_allocno_class) > >> > >> +/* Scale of callee-saved register cost in epilogue and prologue used by > >> + IRA. */ > >> +DEFHOOK > >> +(ira_callee_saved_register_cost_scale, > >> + "A target hook which returns the callee-saved register > >> @var{hard_regno}\n\ > >> +cost scale in epilogue and prologue used by IRA.\n\ > >> +\n\ > >> +The default version of this target hook returns 1 if optimizing for\n\ > >> +size, otherwise returns the entry block frequency.", > >> + int, (int hard_regno), > >> + default_ira_callee_saved_register_cost_scale) > >> + > >> /* Return true if we use LRA instead of reload. */ > >> DEFHOOK > >> (lra_p, > >> diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc > >> index f80dc8b2e7e..344075efa41 100644 > >> --- a/gcc/targhooks.cc > >> +++ b/gcc/targhooks.cc > >> @@ -1305,6 +1305,14 @@ default_ira_change_pseudo_allocno_class (int regno > >> ATTRIBUTE_UNUSED, > >> return cl; > >> } > >> > >> +int > >> +default_ira_callee_saved_register_cost_scale (int) > >> +{ > >> + return (optimize_size > >> + ? 1 > > > > Why use optimize_size when REG_FREQ_FROM_BB uses > > > > ((optimize_function_for_size_p (cfun) \ > > || !cfun->cfg->count_max.initialized_p ()) > > > > ? You are changing behavior for all targets this way? Did you consider > > altering REG_FREQ_FROM_BB to > > > > #define REG_FREQ_FROM_BB(bb, size_value) > > ((optimize_function_for_size_p (cfun) \ > > || !cfun->cfg->count_max.initialized_p ()) > > \ > > ? default \ > > : ((bb)->count.to_frequency (cfun) > > \ > > * REG_FREQ_MAX / BB_FREQ_MAX) > > \ > > ? ((bb)->count.to_frequency (cfun) > > \ > > * REG_FREQ_MAX / BB_FREQ_MAX) > > \ > > : 1) > > > > so return a specified value for the optimize_size case? (see the > > other thread where > > I wonder about that odd !cfun->cfg->count_max.initialized_p () check) > > > > IMO at this point a new target hook should preserve existing behavior by > > default > > or alternatively the original patch should be reverted as causing > > regressions and > > a new patch introducing the target hook should be installed in next stage1. > > I don't think we should add a new target hook unless it's providing > genuinely new information about the target. Hooking into the RA to > brute-force a particular heuristic makes it harder to improve the RA > in future. > > There are already hooks that provide the costs of the relevant operations, > so I think we should concentrate on using those to get good results for > both Power and x86.
It isn't just about Power and x86. commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b Author: Surya Kumari Jangala <jskum...@linux.ibm.com> Date: Tue Jun 25 08:37:49 2024 -0500 ira: Scale save/restore costs of callee save registers with block frequency caused regressions on many targets, including aarch64: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116028 I don't understand why frequency can be used to scale the cost. -- H.J.