On 5/28/25 9:05 PM, Jiawei wrote:
This seems like it would be much better as a combine pattern. In
fact, I'm a bit surprised that combine didn't simplify this series of
operations into a IOR. So I'd really like to see the .combine dump
with and without this hunk for the relevant testcase.
Here is the dump log, using
trunk(7fca794e0199baff8f07140a950ba3374c6aa634), more details please see
https://godbolt.org/z/3hfzdz3Ks
===========================================================================================
~/rv/bin/riscv64-unknown-linux-gnu-g++ -march=rv64gc_zba_zbb_zbs -O2 -S
-fdump-rtl-all redundant-bitmap-2.C
before combine in .ext_dce
Thanks! That was helpful.
I was looking for the full .combine dump -- the full dump includes
information about patterns that were tried and failed. That often will
point the way to a better solution.
In particular in the .combine dump we have this nugget:
Trying 15 -> 18:
15: r151:DI=0xfffffffffffffffe<-<r156:DI#0&r149:DI
REG_DEAD r149:DI
18: r154:DI=0x1<<r156:DI#0^r151:DI
REG_DEAD r156:DI
REG_DEAD r151:DI
Failed to match this instruction:
(set (reg:DI 154)
(xor:DI (and:DI (rotate:DI (const_int -2 [0xfffffffffffffffe])
(subreg:QI (reg:DI 156 [ b ]) 0))
(reg:DI 149 [ *a_10(D) ]))
(ashift:DI (const_int 1 [0x1])
(subreg:QI (reg:DI 156 [ b ]) 0))))
I think combine really should have simplified that before querying the
target. That really should have been simpified to a bit insertion idiom
or perhaps an simpler ior.
More generally, the question we should first ask is whether or not the
source should have simplified independent of the target. I think the
answer is yes in this case, which means we should try to fix that
problem first since it'll improve every target rather than just RISC-V.
When we do find ourselves needing to write new target patterns, a
define_insn will generally be preferable to a define_peephole.
The define_insn will match when there's a data dependency within a basic
block. A define_peephole requires the insns to be consecutive in the
IL. Thus the define_insn will tend to match more often and is those
preferable to a define_peephole.
Anyway, to recap, I think the better solution is to improve
simplify_binary_operation or one of its children or perhaps
simplify_compound_operation its related functions.
jeff