On 04/12/24 19:48 +0000, Jonathan Wakely wrote:
On 04/12/24 11:03 -0800, Vineet Gupta wrote:
sched1 computes ECC (Excess Change Cost) for each insn, which represents
the register pressure attributed to the insn.
Currently the pressure-sensitive scheduling algorithm deliberately ignores
negative ECC values (pressure reduction), making them 0 (neutral), leading
to more spills. This happens due to the assumption that the compiler has
a reasonably accurate processor pipeline scheduling model and thus tries
to aggressively fill pipeline bubbles with spill slots.

This however might not be true, as the model might not be available for
certain uarches, or might not even be applicable, especially for modern
out-of-order cores.

The existing heuristic induces spill frenzy on RISC-V, noticeably so on
SPEC2017 507.Cactu. If insn scheduling is disabled completely, the
total dynamic icounts for this workload are roughly halved, from
~2.5 trillion insns to ~1.3 trillion (w/ -fno-schedule-insns).

This patch adds --param=cycle-accurate-model={0,1} to gate the spill
behavior.

- The default (1) preserves existing spill behavior.

- targets/uarches sensitive to spilling can override the param to (0)
 to get the reverse effect. RISC-V backend does so too.
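
As a rough illustration of the gating described above, here is a toy model
in Python. It is not the haifa-sched code: the function name effective_ecc
is hypothetical, and the only behavior assumed is what the text states
(negative ECC is clamped to 0 when the param is 1, and kept when it is 0).

```python
def effective_ecc(ecc: int, cycle_accurate_model: int = 1) -> int:
    """Return the ECC value the scheduler would act on (toy model).

    ecc: Excess Change Cost of an insn; negative means the insn
         reduces register pressure.
    cycle_accurate_model: stands in for --param=cycle-accurate-model.
    """
    if cycle_accurate_model:
        # Existing behavior: pressure reduction is treated as neutral.
        return max(ecc, 0)
    # Param set to 0: pressure-reducing insns keep their negative cost,
    # so the scheduler can prefer them.
    return ecc
```

With the default, effective_ecc(-2) is 0, while effective_ecc(-2, 0) stays -2.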

The actual perf numbers are very promising.

(1) On RISC-V BPI-F3 in-order CPU, -Ofast -march=rv64gcv_zba_zbb_zbs:

Before:
------
Performance counter stats for './cactusBSSN_r_base.rivos spec_ref.par':

    4,917,712.97 msec task-clock:u                     #    1.000 CPUs utilized
           5,314      context-switches:u               #    1.081 /sec
               3      cpu-migrations:u                 #    0.001 /sec
         204,784      page-faults:u                    #   41.642 /sec
7,868,291,222,513      cycles:u                         #    1.600 GHz
2,615,069,866,153      instructions:u                   #    0.33  insn per cycle
  10,799,381,890      branches:u                       #    2.196 M/sec
      15,714,572      branch-misses:u                  #    0.15% of all branches

After:
-----
Performance counter stats for './cactusBSSN_r_base.rivos spec_ref.par':

    4,552,979.58 msec task-clock:u                     #    0.998 CPUs utilized
         205,020      context-switches:u               #   45.030 /sec
               2      cpu-migrations:u                 #    0.000 /sec
         204,221      page-faults:u                    #   44.854 /sec
7,285,176,204,764      cycles:u        (7.4% faster)    #    1.600 GHz
2,145,284,345,397      instructions:u (17.96% fewer)    #    0.29  insn per cycle
  10,799,382,011      branches:u                       #    2.372 M/sec
      16,235,628      branch-misses:u                  #    0.15% of all branches

(2) Wilco reported 20% perf gains on aarch64 Neoverse V2 runs.

gcc/ChangeLog:
        PR target/11472

Note that you typo'd the PR number here, so the commit added a comment
to the wrong bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11472

The server-side git hooks that check commit logs didn't notice it,
because you used [PR/114729] in the subject line, which should be in
the form [PR114729]. If you'd used the correct form, the hooks would
have told you that the two PR numbers didn't match.
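
To make the mismatch concrete, here is a minimal sketch of that kind of
check in Python. It is not the actual hook code: the function name, both
regexes, and the example subject lines are assumptions for illustration.
It only compares numbers when the subject uses the [PRnnn] form, which is
why a [PR/nnn] subject slips through.

```python
import re

# Assumed patterns: "[PR114729]" in the subject line, and
# "PR component/114729" on its own line in the ChangeLog part.
SUBJECT_PR = re.compile(r"\[PR(\d+)\]")
CHANGELOG_PR = re.compile(r"^\s*PR\s+[a-z+-]+/(\d+)\s*$", re.MULTILINE)


def pr_mismatches(subject: str, body: str) -> set[str]:
    """Return PR numbers that appear in only one of subject and body."""
    subject_prs = set(SUBJECT_PR.findall(subject))
    body_prs = set(CHANGELOG_PR.findall(body))
    if not subject_prs or not body_prs:
        # Nothing to compare, so nothing is reported.
        return set()
    return subject_prs ^ body_prs


body = "gcc/ChangeLog:\n\tPR target/11472\n"
# Hypothetical subject lines, one per form.
print(pr_mismatches("example subject [PR/114729]", body))
print(pr_mismatches("example subject [PR114729]", body))
```

The first call reports nothing because [PR/114729] is not recognized as a
PR reference; the second reports both 114729 and 11472 as unmatched.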

I see you've been using the PR/nnn form for all your commits. Please
use the [PRnnn] form as described at https://gcc.gnu.org/contribute.html#patches


Also it looks like the actual component in bugzilla is
"rtl-optimization" not "target", so shouldn't the ChangeLog entry use
"PR rtl-optimization/114729" ?
