On Mon, Feb 3, 2025 at 6:29 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Richard Biener <richard.guent...@gmail.com> writes:
> > On Mon, Feb 3, 2025 at 7:23 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >>
> >> commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
> >> Author: Surya Kumari Jangala <jskum...@linux.ibm.com>
> >> Date:   Tue Jun 25 08:37:49 2024 -0500
> >>
> >>     ira: Scale save/restore costs of callee save registers with block 
> >> frequency
> >>
> >> scales the cost of saving/restoring a callee-save hard register in epilogue
> >> and prologue with the entry block frequency, which, if not optimizing for
> >> size, is 10000, for all targets.  As the result, callee-saved registers
> >> may not be used to preserve local variable values across calls on some
> >> targets, like x86.  Add a target hook for the callee-saved register cost
> >> scale in epilogue and prologue used by IRA.  The default version of this
> >> target hook returns 1 if optimizing for size, otherwise returns the entry
> >> block frequency.  Add an x86 version of this target hook to restore the
> >> old behavior prior to the above commit.
> >>
> >>         PR rtl-optimization/111673
> >>         PR rtl-optimization/115932
> >>         PR rtl-optimization/116028
> >>         PR rtl-optimization/117081
> >>         PR rtl-optimization/117082
> >>         PR rtl-optimization/118497
> >>         * ira-color.cc (assign_hard_reg): Call the target hook for the
> >>         callee-saved register cost scale in epilogue and prologue.
> >>         * target.def (ira_callee_saved_register_cost_scale): New target
> >>         hook.
> >>         * targhooks.cc (default_ira_callee_saved_register_cost_scale):
> >>         New.
> >>         * targhooks.h (default_ira_callee_saved_register_cost_scale):
> >>         Likewise.
> >>         * config/i386/i386.cc (ix86_ira_callee_saved_register_cost_scale):
> >>         New.
> >>         (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Likewise.
> >>         * doc/tm.texi: Regenerated.
> >>         * doc/tm.texi.in (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE):
> >>         New.
> >>
> >> Signed-off-by: H.J. Lu <hjl.to...@gmail.com>
> >> ---
> >>  gcc/config/i386/i386.cc | 11 +++++++++++
> >>  gcc/doc/tm.texi         |  8 ++++++++
> >>  gcc/doc/tm.texi.in      |  2 ++
> >>  gcc/ira-color.cc        |  3 +--
> >>  gcc/target.def          | 12 ++++++++++++
> >>  gcc/targhooks.cc        |  8 ++++++++
> >>  gcc/targhooks.h         |  1 +
> >>  7 files changed, 43 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> >> index f89201684a8..3128973ba79 100644
> >> --- a/gcc/config/i386/i386.cc
> >> +++ b/gcc/config/i386/i386.cc
> >> @@ -20600,6 +20600,14 @@ ix86_class_likely_spilled_p (reg_class_t rclass)
> >>    return false;
> >>  }
> >>
> >> +/* Implement TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE.  */
> >> +
> >> +static int
> >> +ix86_ira_callee_saved_register_cost_scale (int)
> >> +{
> >> +  return 1;
> >> +}
> >> +
> >>  /* Return true if a set of DST by the expression SRC should be allowed.
> >>     This prevents complex sets of likely_spilled hard regs before split1.  
> >> */
> >>
> >> @@ -27078,6 +27086,9 @@ ix86_libgcc_floating_mode_supported_p
> >>  #define TARGET_PREFERRED_OUTPUT_RELOAD_CLASS 
> >> ix86_preferred_output_reload_class
> >>  #undef TARGET_CLASS_LIKELY_SPILLED_P
> >>  #define TARGET_CLASS_LIKELY_SPILLED_P ix86_class_likely_spilled_p
> >> +#undef TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE
> >> +#define TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE \
> >> +  ix86_ira_callee_saved_register_cost_scale
> >>
> >>  #undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST
> >>  #define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \
> >> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> >> index 0de24eda6f0..9f42913a4ef 100644
> >> --- a/gcc/doc/tm.texi
> >> +++ b/gcc/doc/tm.texi
> >> @@ -3047,6 +3047,14 @@ A target hook which can change allocno class for 
> >> given pseudo from
> >>    The default version of this target hook always returns given class.
> >>  @end deftypefn
> >>
> >> +@deftypefn {Target Hook} int TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE 
> >> (int @var{hard_regno})
> >> +A target hook which returns the callee-saved register @var{hard_regno}
> >> +cost scale in epilogue and prologue used by IRA.
> >> +
> >> +The default version of this target hook returns 1 if optimizing for
> >> +size, otherwise returns the entry block frequency.
> >> +@end deftypefn
> >> +
> >>  @deftypefn {Target Hook} bool TARGET_LRA_P (void)
> >>  A target hook which returns true if we use LRA instead of reload pass.
> >>
> >> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> >> index 631d04131e3..6dbe22581ca 100644
> >> --- a/gcc/doc/tm.texi.in
> >> +++ b/gcc/doc/tm.texi.in
> >> @@ -2388,6 +2388,8 @@ in the reload pass.
> >>
> >>  @hook TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS
> >>
> >> +@hook TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE
> >> +
> >>  @hook TARGET_LRA_P
> >>
> >>  @hook TARGET_REGISTER_PRIORITY
> >> diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
> >> index 0699b349a1a..233060e1587 100644
> >> --- a/gcc/ira-color.cc
> >> +++ b/gcc/ira-color.cc
> >> @@ -2180,8 +2180,7 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
> >>                          + ira_memory_move_cost[mode][rclass][1])
> >>                         * saved_nregs / hard_regno_nregs (hard_regno,
> >>                                                           mode) - 1)
> >> -                      * (optimize_size ? 1 :
> >> -                         REG_FREQ_FROM_BB (ENTRY_BLOCK_PTR_FOR_FN 
> >> (cfun)));
> >> +                      * targetm.ira_callee_saved_register_cost_scale 
> >> (hard_regno);
> >>             cost += add_cost;
> >>             full_cost += add_cost;
> >>           }
> >> diff --git a/gcc/target.def b/gcc/target.def
> >> index 4050b2ebdd4..c348b15815a 100644
> >> --- a/gcc/target.def
> >> +++ b/gcc/target.def
> >> @@ -5714,6 +5714,18 @@ DEFHOOK
> >>   reg_class_t, (int, reg_class_t, reg_class_t),
> >>   default_ira_change_pseudo_allocno_class)
> >>
> >> +/* Scale of callee-saved register cost in epilogue and prologue used by
> >> +   IRA.  */
> >> +DEFHOOK
> >> +(ira_callee_saved_register_cost_scale,
> >> + "A target hook which returns the callee-saved register 
> >> @var{hard_regno}\n\
> >> +cost scale in epilogue and prologue used by IRA.\n\
> >> +\n\
> >> +The default version of this target hook returns 1 if optimizing for\n\
> >> +size, otherwise returns the entry block frequency.",
> >> + int, (int hard_regno),
> >> + default_ira_callee_saved_register_cost_scale)
> >> +
> >>  /* Return true if we use LRA instead of reload.  */
> >>  DEFHOOK
> >>  (lra_p,
> >> diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
> >> index f80dc8b2e7e..344075efa41 100644
> >> --- a/gcc/targhooks.cc
> >> +++ b/gcc/targhooks.cc
> >> @@ -1305,6 +1305,14 @@ default_ira_change_pseudo_allocno_class (int regno 
> >> ATTRIBUTE_UNUSED,
> >>    return cl;
> >>  }
> >>
> >> +int
> >> +default_ira_callee_saved_register_cost_scale (int)
> >> +{
> >> +  return (optimize_size
> >> +         ? 1
> >
> > Why use optimize_size when REG_FREQ_FROM_BB uses
> >
> > ((optimize_function_for_size_p (cfun)            \
> >  || !cfun->cfg->count_max.initialized_p ())
> >
> > ?  You are changing behavior for all targets this way?  Did you consider
> > altering REG_FREQ_FROM_BB to
> >
> > #define REG_FREQ_FROM_BB(bb, size_value)
> > ((optimize_function_for_size_p (cfun)            \
> >                                || !cfun->cfg->count_max.initialized_p ())   
> >   \
> >                               ? default                                  \
> >                               : ((bb)->count.to_frequency (cfun)            
> >   \
> >                                 * REG_FREQ_MAX / BB_FREQ_MAX)               
> >   \
> >                               ? ((bb)->count.to_frequency (cfun)            
> >   \
> >                                  * REG_FREQ_MAX / BB_FREQ_MAX)              
> >   \
> >                               : 1)
> >
> > so return a specified value for the optimize_size case?  (see the
> > other thread where
> > I wonder about that odd !cfun->cfg->count_max.initialized_p () check)
> >
> > IMO at this point a new target hook should preserve existing behavior by 
> > default
> > or alternatively the original patch should be reverted as causing
> > regressions and
> > a new patch introducing the target hook should be installed in next stage1.
>
> I don't think we should add a new target hook unless it's providing
> genuinely new information about the target.  Hooking into the RA to
> brute-force a particular heuristic makes it harder to improve the RA
> in future.
>
> There are already hooks that provide the costs of the relevant operations,
> so I think we should concentrate on using those to get good results for
> both Power and x86.

It isn't just about Power and x86.

commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
Author: Surya Kumari Jangala <jskum...@linux.ibm.com>
Date:   Tue Jun 25 08:37:49 2024 -0500

    ira: Scale save/restore costs of callee save registers with block frequency

caused regressions on many targets, including aarch64:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116028

I don't understand why frequency can be used to scale the
cost.

-- 
H.J.

Reply via email to