On 12/7/24 11:33 AM, Hans-Peter Nilsson wrote:
On Sat, 30 Nov 2024, Jeff Law wrote:
On 11/28/24 5:26 AM, Alexey Merzlyakov wrote:
This patch adds optimization of the following patterns:
(zero_extend:M (subreg:N (not:O==M (X:Q==M)))) ->
(xor:M (zero_extend:M (subreg:N (X:M)), mask))
... where the mask is GET_MODE_MASK (N).
For the cases when X:M doesn't have any non-zero bits outside of mode N,
(zero_extend:M (subreg:N (X:M))) could be simplified to just (X:M),
and the whole optimization will be:
(zero_extend:M (subreg:N (not:M (X:M)))) ->
(xor:M (X:M, mask))
The patch targets code patterns like:
not a0,a0
andi a0,a0,0xff
to be optimized to:
xori a0,a0,255
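For illustration only (this is not code from the patch): the identity behind the transformation can be checked in plain C. The sketch below assumes M is a 32-bit mode and N an 8-bit mode, so the mask is 0xff; the function names are made up for this example and do not correspond to anything in GCC.

```c
#include <stdint.h>

/* Original form: (zero_extend:M (subreg:N (not:M x))),
   assuming M = 32 bits and N = 8 bits. */
static uint32_t not_then_zext8(uint32_t x)
{
    return (uint32_t)(uint8_t)~x;
}

/* General rewritten form: xor of the zero-extended low part with the mask. */
static uint32_t zext8_then_xor(uint32_t x)
{
    return (uint32_t)(uint8_t)x ^ 0xffu;
}

/* Reduced form: only valid when x has no bits set outside the low 8 bits. */
static uint32_t xor_only(uint32_t x)
{
    return x ^ 0xffu;
}

/* Returns 1 if both identities hold over a sample range of inputs, else 0. */
static int check_identities(void)
{
    for (uint32_t x = 0; x < 0x20000u; x++) {
        if (not_then_zext8(x) != zext8_then_xor(x))
            return 0;
        if ((x & ~0xffu) == 0 && not_then_zext8(x) != xor_only(x))
            return 0;
    }
    return 1;
}
```

The reduced form is checked only for inputs that fit in 8 bits, which mirrors the "no non-zero bits outside of mode N" condition above.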
Thanks. I've bootstrapped & regression tested this on x86_64 as well as run
it through my tester successfully. Pushed to the trunk. Hopefully no fallout
this time :-)
jeff
This doesn't look like an obviously generic, universal
optimization.
Targets that have to move a constant to a register for the xor
mask, but have simple access to sub-word "not" and zero-extend,
will be pessimized by having the "not" and zero-extend replaced
with an xor with a constant, IIUC.
This isn't any different than targets which can't efficiently handle the
0xffff case. Such targets should reject the xor with a constant if they
can't handle it, which will keep combine from utilizing the xor with
constant sequence. I've already seen that in practice.
For targets that have xor with constant, but for which it is more
expensive than the original sequence, if they have their costing right,
the right thing should just happen as well, though I haven't personally
seen that behavior.
Jeff