On 12/7/24 11:33 AM, Hans-Peter Nilsson wrote:
On Sat, 30 Nov 2024, Jeff Law wrote:



On 11/28/24 5:26 AM, Alexey Merzlyakov wrote:
This patch adds optimization of the following patterns:

    (zero_extend:M (subreg:N (not:O==M (X:Q==M)))) ->
    (xor:M (zero_extend:M (subreg:N (X:M)), mask))
    ... where the mask is GET_MODE_MASK (N).

For the cases when X:M doesn't have any non-zero bits outside of mode N,
(zero_extend:M (subreg:N (X:M))) can be simplified to just (X:M),
and the whole optimization becomes:

    (zero_extend:M (subreg:N (not:M (X:M)))) ->
    (xor:M (X:M, mask))

The patch targets code patterns like:
    not   a0,a0
    andi  a0,a0,0xff
which are optimized to:
    xori  a0,a0,255
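
The identity behind the rewrite can be checked outside the compiler. The following is only a minimal sketch in plain C, assuming M is a 64-bit mode and N an 8-bit mode; the function names are hypothetical and are not part of the patch or of GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original form: (zero_extend:M (subreg:N (not:M x))),
   assuming M = 64 bits and N = 8 bits. */
static uint64_t not_then_zext8(uint64_t x)
{
    return (uint64_t)(uint8_t)(~x);
}

/* General rewritten form: (zero_extend:M (subreg:N x)) xor mask.
   0xff plays the role of GET_MODE_MASK (N) for an 8-bit N. */
static uint64_t zext8_then_xor(uint64_t x)
{
    const uint64_t mask = 0xff;
    return (uint64_t)(uint8_t)x ^ mask;
}

/* Special case: a bare xor, valid only when x has no bits set
   outside the low 8 bits (the precondition is asserted). */
static uint64_t xor_only(uint64_t x)
{
    assert((x & ~(uint64_t)0xff) == 0);
    return x ^ 0xff;
}

/* Returns 1 if the original and general rewritten forms agree
   on every sample value, 0 otherwise. */
static int forms_agree(void)
{
    static const uint64_t samples[] = {
        0, 1, 0x7f, 0x80, 0xff, 0x100, 0x1234, UINT64_MAX
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (not_then_zext8(samples[i]) != zext8_then_xor(samples[i]))
            return 0;
    }
    return 1;
}
```

For x = 0x1234 the first two forms both give 0xcb, while a bare xor would give 0x12cb, which is why the shorter form requires that X have no non-zero bits outside mode N.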

Thanks.  I've bootstrapped & regression tested this on x86_64 as well as run
it through my tester successfully.  Pushed to the trunk.  Hopefully no fallout
this time :-)

jeff



This doesn't look like an obviously generic, universal
optimization.

Targets that have to move a constant to a register for the xor
mask, but have simple access to sub-word "not" and zero-extend,
will be pessimized by having the "not" and zero-extend replaced
with an xor with a constant, IIUC.
This isn't any different from targets which can't efficiently handle the 0xffff case. Such targets should reject the xor with a constant if they can't handle it, which will keep combine from using the xor-with-constant sequence. I've already seen that in practice.

For targets that have xor with constant, but for which it is more expensive than the original sequence, the right thing should just happen as well, provided they have their costing right, though I haven't personally seen that behavior.

Jeff
