Committed 2 weeks ago, but apparently I didn't send mail to say that. Thanks, Vineet.
On Thu, Mar 2, 2023 at 3:56 AM Philipp Tomsich <philipp.toms...@vrull.eu> wrote:
>
> On Wed, 1 Mar 2023 at 20:53, Vineet Gupta <vine...@rivosinc.com> wrote:
> >
> > This showed up as a dynamic icount regression in SPEC 531.deepsjeng with
> > upstream gcc (vs. gcc 12.2). gcc was resorting to a synthetic multiply
> > using shift+add(s) even when the multiply had a clear cost benefit.
> >
> > |00000000000133b8 <see(state_t*, int, int, int, int) [clone .constprop.0]+0x382>:
> > |   133b8: srl     a3,a1,s6
> > |   133bc: and     a3,a3,s5
> > |   133c0: slli    a4,a3,0x9
> > |   133c4: add     a4,a4,a3
> > |   133c6: slli    a4,a4,0x9
> > |   133c8: add     a4,a4,a3
> > |   133ca: slli    a3,a4,0x1b
> > |   133ce: add     a4,a4,a3
> >
> > vs. gcc 12 doing something like below.
> >
> > |00000000000131c4 <see(state_t*, int, int, int, int) [clone .constprop.0]+0x35c>:
> > |   131c4: ld      s1,8(sp)
> > |   131c6: srl     a3,a1,s4
> > |   131ca: and     a3,a3,s11
> > |   131ce: mul     a3,a3,s1
> >
> > Bisected this to f90cb39235c4 ("RISC-V: costs: support shift-and-add in
> > strength-reduction"). The intent was to optimize the cost for
> > shift-add-pow2-{1,2,3}, corresponding to the bitmanip insns SH*ADD, but it
> > ended up doing that for all shift values, which seems to favor synthesizing
> > multiplies among other things.
> >
> > The bug itself is trivial: IN_RANGE() was calling pow2p_hwi(), which
> > returns a bool, instead of exact_log2(), which returns the log2 of the
> > value.
> >
> > This fix also requires an update to the test introduced by the same
> > commit, which now generates a MUL vs. synthesizing it.
> >
> > gcc/ChangeLog:
> >
> >         * config/riscv/riscv.cc (riscv_rtx_costs): Fixed IN_RANGE() to
> >         use exact_log2().
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/riscv/zba-shNadd-07.c: f2(i*783) now generates MUL vs.
> >         5 insn sh1add+slli+add+slli+sub.
> >         * gcc.target/riscv/pr108987.c: New test.
> >
> > Signed-off-by: Vineet Gupta <vine...@rivosinc.com>
> > Reviewed-by: Philipp Tomsich <philipp.toms...@vrull.eu>
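
For anyone finding this in the archives, a minimal sketch of the predicate at
issue (not the verbatim riscv.cc source; 'factor' stands in for the constant
multiplier operand the cost hook actually inspects):

    /* Buggy: pow2p_hwi() returns a bool (0 or 1), so the range check
       passed for *every* power-of-2 factor, not just 2, 4 and 8.  */
    if (IN_RANGE (pow2p_hwi (factor), 1, 3))
      /* cost as a single shNadd */;

    /* Fixed: exact_log2() returns the exponent, or -1 for a
       non-power-of-2, so only factors 2, 4 and 8 match; exactly the
       SH1ADD/SH2ADD/SH3ADD cases the Zba extension provides.  */
    if (IN_RANGE (exact_log2 (factor), 1, 3))
      /* cost as a single shNadd */;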