On 5/25/23 06:35, Manolis Tsamis wrote:
Implementation of the new RISC-V optimization pass for memory offset
calculations, documentation and testcases.

gcc/ChangeLog:

        * config.gcc: Add riscv-fold-mem-offsets.o to extra_objs.
        * config/riscv/riscv-passes.def (INSERT_PASS_AFTER): Schedule a new
        pass.
        * config/riscv/riscv-protos.h (make_pass_fold_mem_offsets): Declare.
        * config/riscv/riscv.opt: New options.
        * config/riscv/t-riscv: New build rule.
        * doc/invoke.texi: Document new option.
        * config/riscv/riscv-fold-mem-offsets.cc: New file.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/fold-mem-offsets-1.c: New test.
        * gcc.target/riscv/fold-mem-offsets-2.c: New test.
        * gcc.target/riscv/fold-mem-offsets-3.c: New test.
So I made a small number of changes so that this could be run on other targets.
I had an hppa compiler handy, so it was trivial to do some light testing with that. f-m-o didn't help at all on the included tests, but I think that's most likely an artifact of the port supporting scaled indexed loads and doing fairly aggressive address rewriting to encourage that addressing mode.
Next, I had an H8 compiler handy. All three included tests showed improvement, both in instruction count and in code size. What was most interesting here is that f-m-o removed some redundant address calculations without needing to adjust the memory references, which was a pleasant surprise.
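For anyone following along without the patch at hand, the kind of source that gives the pass something to work on can be sketched roughly as below. This is a hypothetical illustration, not one of the fold-mem-offsets-*.c tests; `struct rec` and `sum_xy` are made-up names, the 16-byte figure assumes a 4-byte int, and whether a given port really emits a separate add for the constant depends on the target and the optimization level.

```c
/* Hypothetical example; not taken from the included testcases.  */
struct rec
{
  int header[4];  /* 16 bytes ahead of x, assuming a 4-byte int.  */
  int x;
  int y;
};

/* The address of r[i].x is r + i * sizeof (struct rec) + 16.  A compiler
   may compute the "+ 16" with its own add instruction and then load at
   offset 0.  Folding the constant into the load's offset field makes
   that add unnecessary.  */
int
sum_xy (const struct rec *r, long i)
{
  return r[i].x + r[i].y;
}
```

Comparing the assembly for a function like this with and without the pass enabled is a quick way to see whether a port benefits.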
Given the fact that both ports worked and the H8 showed an improvement, the next step was to put the patch into my tester. It tests 30+ distinct processor families. The goal wasn't to evaluate effectiveness, but to validate that those targets could still build their target libraries and successfully run their testsuites.
That's now run through the various crosses. Things like the hppa, alpha, and m68k bootstraps only run once a week, as they take many hours each. The result is quite encouraging: none of the crosses had any build issues or regressions.
The net result, I think, is that we should probably move this to a target-independent optimization pass. We only need to generalize a few things.
Most importantly, we need to get a resolution on the conditional I asked about inside get_single_def_in_bb. There's some other refactoring I think we should do, but I'd really like to settle the code in get_single_def_in_bb first; then we ought to be able to move forward pretty quickly on the refactoring and integration.
jeff