The RTL pattern includes an "and" operation, which clears the upper bits after the shift. Since the insn condition requires (INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff, the mask keeps only the 32 bits that come from the low half of operand 1, so the RTL template and the split code should be semantically identical. Also, the RTL template here is essentially the same as that of "*slliuw" in bitmanip.md, and the split code expresses the semantics of an slli.uw operation.
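To make the equivalence argument concrete, here is a small C sketch that models both forms on 64-bit values. It is only an illustration under stated assumptions: the helper names (pattern_form, split_form, mask_ok, count_mismatches) are hypothetical and not part of the patch, uint64_t stands in for DImode, and the signed INTVAL comparison is modeled with an unsigned shift, which gives the same answer for masks that satisfy the condition.

```c
#include <stdint.h>

/* Hypothetical model of the matched RTL:
   (and:DI (ashift:DI x s) mask).  */
static uint64_t pattern_form(uint64_t x, unsigned s, uint64_t mask)
{
    return (x << s) & mask;
}

/* Hypothetical model of the split sequence:
   zero_extend:DI of the low SImode part, then ashift:DI by s.  */
static uint64_t split_form(uint64_t x, unsigned s)
{
    uint64_t z = (uint64_t)(uint32_t)x;   /* keep only the low 32 bits */
    return z << s;
}

/* Model of the insn condition (mask >> s) == 0xffffffff.
   Assumption: an unsigned shift is used here in place of INTVAL's
   signed HOST_WIDE_INT shift.  */
static int mask_ok(uint64_t mask, unsigned s)
{
    return (mask >> s) == 0xffffffffu;
}

/* Compare both forms for shift amounts 1..3 (the imm123 range) over a
   few sample inputs; returns the number of mismatches found.  */
static int count_mismatches(void)
{
    static const uint64_t samples[] = {
        0x0ULL, 0x1ULL, 0x80000000ULL, 0xffffffffULL,
        0x100000000ULL, 0xffffffff00000001ULL,
        0x123456789abcdef0ULL, 0xffffffffffffffffULL
    };
    int bad = 0;

    for (unsigned s = 1; s <= 3; s++) {
        /* The low s bits of the mask are not constrained by the
           condition, so try every combination of them.  */
        for (uint64_t low = 0; low < (1ULL << s); low++) {
            uint64_t mask = (0xffffffffULL << s) | low;
            if (!mask_ok(mask, s)) {
                bad++;
                continue;
            }
            for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
                if (pattern_form(samples[i], s, mask)
                    != split_form(samples[i], s))
                    bad++;
            }
        }
    }
    return bad;
}
```

If the reasoning above holds, count_mismatches () should return 0: the low s bits of the shifted value are always zero, so the unconstrained low mask bits do not matter, and the condition forces every mask bit above position s + 31 to be clear.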
Bohan

------------------------------------------------------------------
From: Jeff Law <jeffreya...@gmail.com>
Send Time: 2025 Mar. 30 (Sun.) 23:38
To: Bohan Lei <garth...@linux.alibaba.com>; "gcc-patches" <gcc-patches@gcc.gnu.org>
CC: "christoph.muellner" <christoph.muell...@vrull.eu>
Subject: Re: [PATCH] RISC-V: xtheadmemidx: Split slli.uw pattern

On 3/23/25 8:43 PM, Bohan Lei wrote:
> The combine pass can generate an index like (and:DI (mult:DI (reg:DI)
> (const_int scale)) (const_int mask)) when XTheadMemIdx is available.
> LRA may pull it out, and thus a splitter is needed when Zba is not
> available.
>
> A similar splitter were introduced when XTheadMemIdx support was added,
> but removed in commit 31c3c5d. The new splitter in this new patch is
> based on the removed one.
>
> gcc/ChangeLog:
>
>     * config/riscv/thead.md (*th_memidx_operand): New splitter.
>
> gcc/testsuite/ChangeLog:
>
>     * gcc.target/riscv/xtheadmemidx-bug.c: New test.

Sorry, this doesn't look correct to me.

> +(define_insn_and_split "*th_memidx_operand"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +     (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
> +                        (match_operand:QI 2 "imm123_operand" "Ds3"))
> +             (match_operand 3 "const_int_operand" "n")))]
> +  "TARGET_64BIT && TARGET_XTHEADMEMIDX && (lra_in_progress || reload_completed)
> +   && (INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff"
> +  "#"
> +  "&& !TARGET_ZBA && reload_completed"
> +  [(set (match_dup 0) (zero_extend:DI (subreg:SI (match_dup 1) 0)))
> +   (set (match_dup 0) (ashift:DI (match_dup 0) (match_dup 2)))]

The RTL pattern matches a DImode register. But the split code zeros out
the upper 32 bits. ie, there is a semantic mismatch between the behavior
of the original RTL pattern and the split RTL code. That can't be correct
and is a strong indicator that the real problem is elsewhere.

Jeff