Ping!
On 10/20/24 12:40, Vineet Gupta wrote:
> Hi,
>
> PFA patch series which improves sched1 spilling. This all started with
> SPEC2017 507.Cactu dynamic icounts on RISC-V being double those of
> aarch64 (~2.6 trillion vs. ~1.4 trillion). Robin/Jeff hinted that the
> issue could be sched1, which it turned out to be.
>
> Essentially there are 2 fixes (plus a debug patch):
>
> - Patch 1/4 improves the main list scheduler outcomes by not
> watering down negative pressure change to zero. It implements
> a target hook, which is separately enabled in patch 2/4 for RISC-V.
>
> - Patch 3/4 improves the model schedule so that it does not increase
> register pressure in certain cases.
>
> - Patch 4/4 is just a debug hack which I would like any testers to
> apply, as it helped me a lot during development of patch 3/4.
>
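
To illustrate the idea behind patch 1/4, here is a minimal sketch. It is not the code from gcc/haifa-sched.cc; the function name and both parameter names are hypothetical, and the real pressure accounting is considerably more involved. It only shows the difference between clamping a negative pressure change to zero and keeping it when the target hook asks for it.

```c
#include <stdbool.h>

/* Hypothetical helper, for illustration only (not GCC's actual code).
   RAW_CHANGE is the register-pressure change an insn would cause;
   a negative value means the insn reduces pressure.
   PREFER_NARROW stands in for the result of the target hook
   TARGET_SCHED_PRESSURE_PREFER_NARROW.  */
static int
effective_pressure_change (int raw_change, bool prefer_narrow)
{
  /* With the hook enabled, a pressure-reducing insn keeps its
     negative value, so it can rank ahead of a neutral insn.  */
  if (prefer_narrow)
    return raw_change;

  /* Otherwise a negative change is watered down to zero, which makes
     a pressure-reducing insn look the same as a neutral one.  */
  return raw_change < 0 ? 0 : raw_change;
}
```

With the hook off, changes of -2 and 0 both come out as 0; with it on, -2 stays -2 and can be preferred.
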
> More details can be found in individual patches.
>
> Results on RISC-V hardware BPI-F3 (perf stat instructions/cycles) and
> on aarch64 (I could only get QEMU dynamic icounts).
>
> RISC-V BPI-F3 (-Ofast -march=rv64gcv_zba_zbb_zbs)
>
> baseline | 7,631,707,552,979 cycles:u                      # 1.600 GHz
>          | 2,630,225,489,010 instructions:u                # 0.34 insn per cycle
>          |
> all      | 6,736,337,207,427 cycles:u       (12% faster)   # 1.600 GHz
> patches  | 2,078,712,047,604 instructions:u (21% fewer)    # 0.31 insn per cycle
>
> aarch64 (-Ofast -march=armv9-a+sve2), with
> TARGET_SCHED_PRESSURE_PREFER_NARROW implemented to return true
>
> baseline | 1,382,403,783,566
>          |
> all      | 1,113,896,471,282 (19.4% fewer)
> patches  |
>
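
For anyone reporting numbers from other hardware, the percentages and insn-per-cycle values above can be reproduced with the small sketch below. The helper names are made up for this example; the inputs are just the raw counts quoted in the tables.

```c
#include <assert.h>

/* Percentage by which PATCHED is lower than BASELINE.
   Hypothetical helper name, used only for this example.  */
static double
percent_fewer (double baseline, double patched)
{
  assert (baseline > 0.0);
  return (baseline - patched) / baseline * 100.0;
}

/* Instructions retired per cycle, as perf stat reports it.
   Hypothetical helper name, used only for this example.  */
static double
insn_per_cycle (double instructions, double cycles)
{
  assert (cycles > 0.0);
  return instructions / cycles;
}
```

Plugging in the RISC-V counts gives roughly 21% fewer instructions and roughly 11.7% fewer cycles (quoted above as 12% faster); the aarch64 icounts come out at roughly 19.4% fewer.
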
> As a follow-up to discussions at Cauldron last month, I'm CC'ing some of
> the aarch64 and power folks to test this on real hardware and get the
> results. Please don't forget to add the equivalent of patch 2/4 for your
> respective backends, i.e.:
>
> +#undef TARGET_SCHED_PRESSURE_PREFER_NARROW
> +#define TARGET_SCHED_PRESSURE_PREFER_NARROW hook_bool_void_true
>
> Thx,
> -Vineet
>
> Vineet Gupta (4):
> sched1: hookize pressure scheduling spilling agressiveness
> RISC-V: Implement TARGET_SCHED_PRESSURE_PREFER_NARROW [PR/114729]
> sched1: model: only promote true dependecies in predecessor promotion
> sched1: model: ICE on infinite loops in predecessor promotion (Not for
> Merge)
>
> gcc/config/riscv/riscv.cc | 3 +
> gcc/doc/tm.texi | 11 ++
> gcc/doc/tm.texi.in | 2 +
> gcc/haifa-sched.cc | 109 ++++++++++++++----
> gcc/sched-rgn.cc | 14 ++-
> gcc/target.def | 13 +++
> gcc/testsuite/gcc.target/riscv/riscv.exp | 2 +
> .../gcc.target/riscv/sched1-spills/hang1.c | 32 +++++
> .../gcc.target/riscv/sched1-spills/hang5.c | 60 ++++++++++
> .../gcc.target/riscv/sched1-spills/spill1.cpp | 31 +++++
> .../gcc.target/riscv/sched1-spills/spill2.cpp | 37 ++++++
> 11 files changed, 289 insertions(+), 25 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/riscv/sched1-spills/hang1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/sched1-spills/hang5.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/sched1-spills/spill1.cpp
> create mode 100644 gcc/testsuite/gcc.target/riscv/sched1-spills/spill2.cpp
>
> --
> 2.43.0
>