On 10/3/23 10:07, Surya Kumari Jangala wrote:
ira: Scale save/restore costs of callee save registers with block frequency

In assign_hard_reg(), when computing the costs of the hard registers, the
cost of saving/restoring a callee-save hard register in prolog/epilog is
taken into consideration. However, this cost is not scaled with the entry
block frequency. Without scaling, the cost of saving/restoring is quite
small and this can result in a callee-save register being chosen by
assign_hard_reg() even though there are free caller-save registers
available. Assigning a callee save register to a pseudo that is live
in the entire function and across a call will cause shrink wrap to fail.

Thank you for addressing this part of the code.  Changes that look obvious sometimes have unexpected results.  I remember experimenting with different heuristics for this code a long time ago, when 32-bit x86 was the major target, and this was the best variant I found.  Since a lot has changed since then, I decided to benchmark your change.

This change increases x86-64 spec2017 code size by 0.67% on average.  The increase is very consistent across 20 spec2017 benchmarks; only the code for bwaves is smaller (by 0.01%).  The specfp2017 performance is the same.  There is one positive impact: specint2017 improved by 0.6% (8.59 vs 8.54), mainly because of improvements in xalancbmk (2.5%) and exchange2 (5%).

So I propose making this change only when we are not optimizing for code size.  Also, please be prepared for possible testsuite failures on other targets: some targets are overconstrained by tests that expect specific generated code.
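
The proposal above can be sketched as a small standalone model.  This is not the actual ira-color.cc code: the function and parameter names are hypothetical stand-ins for ira_memory_move_cost, saved_nregs, hard_regno_nregs and REG_FREQ_FROM_BB, and the size flag stands in for whatever predicate the real patch would use (presumably something like optimize_function_for_size_p (cfun), but that choice is an assumption).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the callee-save cost term in assign_hard_reg ().
   All names are hypothetical; none of this is GCC-internal code.  */
int
callee_save_cost (int store_cost, int load_cost, int saved_nregs,
                  int nregs, int entry_freq, bool optimize_for_size)
{
  assert (nregs > 0);

  /* Same shape as the existing expression: memory move costs for the
     save and the restore, prorated over the registers to be saved.  */
  int add_cost = (store_cost + load_cost) * saved_nregs / nregs - 1;

  /* Scale by the entry block frequency only when not optimizing for
     size, so that size-optimized code keeps the old, unscaled cost.  */
  if (!optimize_for_size)
    add_cost *= entry_freq;

  return add_cost;
}
```

With store and load costs of 4 each, one saved register and an entry frequency of 1000, the model returns 7000 when optimizing for speed and 7 when optimizing for size, which shows how much more expensive a callee-save register looks once the cost is scaled.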

2023-10-03  Surya Kumari Jangala  <jskum...@linux.ibm.com>

gcc/
        PR rtl-optimization/111673
        * ira-color.cc (assign_hard_reg): Scale save/restore costs of
        callee save registers with block frequency.

gcc/testsuite/
        PR rtl-optimization/111673
        * gcc.target/powerpc/pr111673.c: New test.
---

diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
index f2e8ea34152..eb20c52310d 100644
--- a/gcc/ira-color.cc
+++ b/gcc/ira-color.cc
@@ -2175,7 +2175,8 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
            add_cost = ((ira_memory_move_cost[mode][rclass][0]
                         + ira_memory_move_cost[mode][rclass][1])
                        * saved_nregs / hard_regno_nregs (hard_regno,
-                                                         mode) - 1);
+                                                         mode) - 1)
+                       * REG_FREQ_FROM_BB (ENTRY_BLOCK_PTR_FOR_FN (cfun));
            cost += add_cost;
            full_cost += add_cost;
          }
diff --git a/gcc/testsuite/gcc.target/powerpc/pr111673.c b/gcc/testsuite/gcc.target/powerpc/pr111673.c
new file mode 100644
index 00000000000..e0c0f85460a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr111673.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2 -fdump-rtl-pro_and_epilogue" } */
+
+/* Verify there is an early return without the prolog and shrink-wrap
+   the function. */
+
+int f (int);
+int
+advance (int dz)
+{
+  if (dz > 0)
+    return (dz + dz) * dz;
+  else
+    return dz * f (dz);
+}
+
+/* { dg-final { scan-rtl-dump-times "Performing shrink-wrapping" 1 "pro_and_epilogue" } } */

