On 1/12/25 7:49 AM, Xi Ruoyao wrote:
The test case
long
test (long x, long y)
{
return ((x | 0x1ff) << 3) + y;
}
is now compiled (-O2 -march=rv64g_zba) to
li a4,4096
slliw a5,a0,3
addi a4,a4,-8
or a5,a5,a4
addw a0,a5,a1
ret
Despite this check was originally intended to use zba better, now
removing it actually enables the use of zba for this test case (thanks
to late combine):
ori a5,a0,511
sh3add a0,a5,a1
ret
Obviously, bitmanip.md does not cover
(any_or (ashift (reg) (imm123)) imm) at all, and even for and it just
seems more natural splitting to (ashift (and (reg) (imm')) (imm123))
first, then let late combine to combine the outer ashift and the plus.
I've not found any test case regressed by the removal.
And "make check-gcc RUNTESTFLAGS=riscv.exp='zba-*.c'" also reports no
failure.
gcc/ChangeLog:
PR target/115921
* config/riscv/riscv.md (<optab>_shift_reverse): Remove
check for TARGET_ZBA.
gcc/testsuite/ChangeLog:
PR target/115921
* gcc.target/riscv/zba-shNadd-08.c: New test.
So I've gone back and tried to jog my memory about what we were trying
avoid with the zba condition, including looking through tests in our
internal tree that hadn't been upstreamed yet (there aren't many and
those that exist are likely related to internal patches that aren't
ready to upstream yet or things that are upstream, but under a different
name).
Unfortunately I couldn't find either a test or context to jog my memory.
The "andi.add.uw" pattern in bitmanip.md or this splitter look to be
the most relevant:
;; During combine, we may encounter an attempt to combine
;; slli rtmp, rs, #imm
;; zext.w rtmp, rtmp
;; sh[123]add rd, rtmp, rs2
;; which will lead to the immediate not satisfying the above constraints.
;; By splitting the compound expression, we can simplify to a slli and a
;; sh[123]add.uw.
(define_split
[(set (match_operand:DI 0 "register_operand")
(plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
(match_operand:QI 2 "immediate_operand"))
(match_operand:DI 3 "consecutive_bits_operand"))
(match_operand:DI 4 "register_operand")))
(clobber (match_operand:DI 5 "register_operand"))]
"TARGET_64BIT && TARGET_ZBA"
[(set (match_dup 5) (ashift:DI (match_dup 1) (match_dup 6)))
(set (match_dup 0) (plus:DI (and:DI (ashift:DI (match_dup 5)
(match_dup 7))
(match_dup 8))
(match_dup 4)))]
The other reasonable possibility would be:
; Make sure that an andi followed by a sh[123]add remains a two instruction
; sequence--and is not torn apart into slli, slri, add.
(define_insn_and_split "*andi_add.uw"
[(set (match_operand:DI 0 "register_operand" "=r")
(plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
(match_operand:QI 2 "imm123_operand" "Ds3"))
(match_operand:DI 3 "consecutive_bits_operand" ""))
(match_operand:DI 4 "register_operand" "r")))
(clobber (match_scratch:DI 5 "=&r"))]
"TARGET_64BIT && TARGET_ZBA
&& riscv_shamt_matches_mask_p (INTVAL (operands[2]), INTVAL (operands[3]))
&& SMALL_OPERAND (INTVAL (operands[3]) >> INTVAL (operands[2]))"
But without a testcase it's obviously hard to be sure either way. But I
do know that condition was added for a reason.
Regardless, I don't think that without a testcase that we should hold up
this patch. It's entirely possible that late-combine cleanings things
up now and we don't need the special casing anymore. Or it may turn out
that we have to revisit (and get a testcase installed if so!).
The net is I'll commit this momentarily.
Thanks,
Jeff