On 1/12/25 7:49 AM, Xi Ruoyao wrote:
The test case

     long
     test (long x, long y)
     {
       return ((x | 0x1ff) << 3) + y;
     }

is now compiled (-O2 -march=rv64g_zba) to

     li      a4,4096
     slliw   a5,a0,3
     addi    a4,a4,-8
     or      a5,a5,a4
     addw    a0,a5,a1
     ret

Despite this check was originally intended to use zba better, now
removing it actually enables the use of zba for this test case (thanks
to late combine):

     ori    a5,a0,511
     sh3add  a0,a5,a1
     ret

Obviously, bitmanip.md does not cover
(any_or (ashift (reg) (imm123)) imm) at all, and even for and it just
seems more natural splitting to (ashift (and (reg) (imm')) (imm123))
first, then let late combine to combine the outer ashift and the plus.

I've not found any test case regressed by the removal.
And "make check-gcc RUNTESTFLAGS=riscv.exp='zba-*.c'" also reports no
failure.

gcc/ChangeLog:

        PR target/115921
        * config/riscv/riscv.md (<optab>_shift_reverse): Remove
        check for TARGET_ZBA.

gcc/testsuite/ChangeLog:

        PR target/115921
        * gcc.target/riscv/zba-shNadd-08.c: New test.
So I've gone back and tried to jog my memory about what we were trying avoid with the zba condition, including looking through tests in our internal tree that hadn't been upstreamed yet (there aren't many and those that exist are likely related to internal patches that aren't ready to upstream yet or things that are upstream, but under a different name).

Unfortunately I couldn't find either a test or context to jog my memory. The "andi.add.uw" pattern in bitmanip.md or this splitter look to be the most relevant:

;; During combine, we may encounter an attempt to combine
;;   slli rtmp, rs, #imm
;;   zext.w rtmp, rtmp
;;   sh[123]add rd, rtmp, rs2
;; which will lead to the immediate not satisfying the above constraints.
;; By splitting the compound expression, we can simplify to a slli and a
;; sh[123]add.uw.
(define_split
  [(set (match_operand:DI 0 "register_operand")
        (plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
                                    (match_operand:QI 2 "immediate_operand"))
                         (match_operand:DI 3 "consecutive_bits_operand"))
                 (match_operand:DI 4 "register_operand")))
   (clobber (match_operand:DI 5 "register_operand"))]
  "TARGET_64BIT && TARGET_ZBA"
[(set (match_dup 5) (ashift:DI (match_dup 1) (match_dup 6))) (set (match_dup 0) (plus:DI (and:DI (ashift:DI (match_dup 5)
                                                  (match_dup 7))
                                       (match_dup 8))
                               (match_dup 4)))]

The other reasonable possibility would be:



; Make sure that an andi followed by a sh[123]add remains a two instruction
; sequence--and is not torn apart into slli, slri, add.
(define_insn_and_split "*andi_add.uw"
  [(set (match_operand:DI 0 "register_operand" "=r")
        (plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
                                    (match_operand:QI 2 "imm123_operand" "Ds3"))
                         (match_operand:DI 3 "consecutive_bits_operand" ""))
                 (match_operand:DI 4 "register_operand" "r")))
   (clobber (match_scratch:DI 5 "=&r"))]
  "TARGET_64BIT && TARGET_ZBA
   && riscv_shamt_matches_mask_p (INTVAL (operands[2]), INTVAL (operands[3]))
   && SMALL_OPERAND (INTVAL (operands[3]) >> INTVAL (operands[2]))"



But without a testcase it's obviously hard to be sure either way. But I do know that condition was added for a reason.

Regardless, I don't think that without a testcase that we should hold up this patch. It's entirely possible that late-combine cleanings things up now and we don't need the special casing anymore. Or it may turn out that we have to revisit (and get a testcase installed if so!).

The net is I'll commit this momentarily.

Thanks,
Jeff




Reply via email to