On 21.03.25 at 01:02, Jeff Law wrote:
On 3/19/25 4:14 AM, Georg-Johann Lay wrote:
On 16.03.25 at 14:51, Jeff Law via Gcc wrote:
On 3/13/25 5:39 AM, Georg-Johann Lay via Gcc wrote:
There are situations where knowledge about which bits
of a value are (not) set can be used for optimization.
For example in an insn combine pattern like:
(define_insn_and_split ""
[(set (match_operand:QI 0 "register_operand" "=d")
(ior:QI (ashift:QI (match_operand:QI 1 "register_operand" "r")
(match_operand:QI 2
"const_0_to_7_operand" "n"))
(match_operand:QI 3 "register_operand" "0")))]
"optimize
&& !reload_completed
&& nonzero_bits (operands[1], VOIDmode) == 1"
...
This pattern is only correct when operands[1] is 0 or 1.
While such patterns seem to work, it's all quite wonky,
in particular since nonzero_bits() may forget about known
properties in later passes.
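
To illustrate why the condition on operands[1] matters, here is a minimal C sketch. It assumes (the split body is elided above, so this is only a guess for illustration) that the replacement code copies bit 0 of operand 1 into bit n of operand 3. The function names ior_ashift, set_bit_if_lsb and count_mismatches are hypothetical and not part of GCC. The sketch compares the QImode expression from the pattern against that replacement for all shift counts and all values of the third operand.

```c
#include <assert.h>
#include <stdint.h>

/* What the RTL expresses: (ior:QI (ashift:QI x n) y), truncated to 8 bits. */
uint8_t ior_ashift(uint8_t x, unsigned n, uint8_t y)
{
    assert(n < 8);                      /* const_0_to_7_operand */
    return (uint8_t)((uint8_t)(x << n) | y);
}

/* Hypothetical replacement: set bit n of y when bit 0 of x is set. */
uint8_t set_bit_if_lsb(uint8_t x, unsigned n, uint8_t y)
{
    assert(n < 8);
    return (x & 1) ? (uint8_t)(y | (1u << n)) : y;
}

/* Number of (n, y) pairs for which the two forms disagree for a given x. */
unsigned count_mismatches(uint8_t x)
{
    unsigned bad = 0;
    for (unsigned n = 0; n < 8; n++)
        for (unsigned y = 0; y < 256; y++)
            if (ior_ashift(x, n, (uint8_t)y) != set_bit_if_lsb(x, n, (uint8_t)y))
                bad++;
    return bad;
}
```

Calling count_mismatches(0) and count_mismatches(1) finds no disagreement, while any x with a bit other than bit 0 set (for example 2) does produce mismatches. That is the property which nonzero_bits (operands[1], VOIDmode) == 1 is meant to guarantee.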
While it works most of the time, it's fundamentally wrong to have a
pattern where the conditional is dependent on state that changes
based on pass specific data, nearby context, etc.
For the use case I have in mind, it is fine if the pattern
only works until split1, which would transform it into
something else (without nonzero_bits() in the insn
condition, because the existence of the pattern already
certifies the bit condition).
It's still the wrong thing to do. You'll get away with it for a
while, but one day it'll break.
We have similar problems in the RISC-V world where we would like to
be able to match certain patterns based on known ranges of an
operand. The most common case would be bset/bclr/binv on an SImode
object on rv64 where the bit twiddled is variable. In particular we
need to know the bit position is not bit 31.
There's no way to really describe that in an insn's condition because
range information like that isn't available in RTL and something like
nonzero bits is pass specific.
As a result we're limited in our ability to use the bset/bclr/binv
instructions.
Jeff
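
The bit 31 restriction can be illustrated with a small C model. It rests on an assumption that is not spelled out above: that on rv64 an SImode value lives sign-extended in a 64-bit register, and that bset sets a bit in the full 64-bit register. The helper names sext32, bset64 and bset_keeps_simode_form are hypothetical and only serve this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of v to 64 bits (assumed SImode register form). */
uint64_t sext32(uint64_t v)
{
    uint64_t low = v & UINT64_C(0xffffffff);
    return (low & UINT64_C(0x80000000)) ? (low | UINT64_C(0xffffffff00000000)) : low;
}

/* Model of a 64-bit bset: set bit pos in the whole register. */
uint64_t bset64(uint64_t reg, unsigned pos)
{
    assert(pos < 64);
    return reg | (UINT64_C(1) << pos);
}

/* Returns 1 if a plain 64-bit bset leaves the register in the
   sign-extended form that a 32-bit result is assumed to have. */
int bset_keeps_simode_form(uint32_t value, unsigned pos)
{
    assert(pos < 32);
    uint64_t reg  = sext32(value);
    uint64_t got  = bset64(reg, pos);
    uint64_t want = sext32((uint64_t)(value | (UINT32_C(1) << pos)));
    return got == want;
}
```

In this model, setting any bit below 31 keeps the register in sign-extended form. Setting bit 31 of a value whose bit 31 is clear does not, because the upper 32 bits would also have to become ones. Under the stated assumption, this is why the insn condition would need range information about the bit position.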
One way to support this is a new target hook that would run somewhere
in recog_for_combine(). The hook would allow the backend to replace
the pattern as synthesized by combine with an equivalent pattern.
Much simpler: Add a split pass immediately after combine. Use
define_insn_and_split to handle rewriting. No hooks needed.
Jeff
Unfortunately, that doesn't work:
.../libgcc/config/avr/libf7/libf7.c: In function '__f7_get_float':
.../libgcc/config/avr/libf7/libf7.c:354:1: error: wrong amount of branch edges after unconditional jump 18
354 | }
| ^
during RTL pass: avr-split-after-combine
.../libgcc/config/avr/libf7/libf7.c:354:1: internal compiler error: verify_flow_info failed
0x1df2e91 internal_error(char const*, ...)
.../gcc/diagnostic-global-context.cc:517
0xa580fe verify_flow_info()
.../gcc/cfghooks.cc:287
0xf084c8 checking_verify_flow_info()
.../gcc/cfghooks.h:214
0xf084c8 split_all_insns()
.../gcc/recog.cc:3608
0xf084e8 execute
.../gcc/recog.cc:4507
This used a clone of pass_split_all_insns which runs
checking_verify_flow_info() at the end. passes.def reads:
NEXT_PASS (pass_combine);
NEXT_PASS (pass_late_combine);
NEXT_PASS (pass_if_after_combine);
NEXT_PASS (pass_jump_after_combine);
NEXT_PASS (pass_partition_blocks);
NEXT_PASS (pass_outof_cfg_layout_mode);
NEXT_PASS (pass_split_all_insns);
So just cloning pass_split_all_insns won't work.
But using split_all_insns_noflow() instead should
do the trick, then?
Johann