On 1/19/24 09:05, Georg-Johann Lay wrote:
On 18.01.24 at 20:54, Roger Sayle wrote:
This patch tweaks RTL expansion of multi-word shifts and rotates to use
PLUS rather than IOR for disjunctive operations. During expansion of
these operations, the middle-end creates RTL like (X<<C1) | (Y>>C2)
where the constants C1 and C2 guarantee that bits don't overlap.
Hence the IOR can be performed by any any_or_plus operation, such as
IOR, XOR or PLUS; for word-size operations where carry chains aren't
an issue these should all be equally fast (single-cycle) instructions.
The benefit of this change is that targets with shift-and-add insns,
like x86's lea, can benefit from the LSHIFT-ADD form.
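
To make the disjointness argument concrete, here is a minimal C sketch (not taken from the patch) of the high word of a two-word left shift. The helper names and the 32-bit word size are assumptions chosen only for illustration. Because (hi << c) has its low c bits clear and (lo >> (32 - c)) can only set those low c bits, no carry can occur and all three operators should give the same result:

```c
#include <stdint.h>

/* High word of a two-word left shift by c, valid for 0 < c < 32.
   Hypothetical helpers; the names are not from GCC.  */
static uint32_t shl_hi_ior(uint32_t hi, uint32_t lo, unsigned c)
{
  return (hi << c) | (lo >> (32 - c));
}

static uint32_t shl_hi_plus(uint32_t hi, uint32_t lo, unsigned c)
{
  return (hi << c) + (lo >> (32 - c));
}

static uint32_t shl_hi_xor(uint32_t hi, uint32_t lo, unsigned c)
{
  return (hi << c) ^ (lo >> (32 - c));
}

/* Returns 1 if the IOR, PLUS and XOR forms agree for every valid
   shift count, 0 otherwise.  */
static int forms_agree(uint32_t hi, uint32_t lo)
{
  for (unsigned c = 1; c < 32; c++)
    {
      uint32_t r = shl_hi_ior(hi, lo, c);
      if (r != shl_hi_plus(hi, lo, c) || r != shl_hi_xor(hi, lo, c))
        return 0;
    }
  return 1;
}
```

The PLUS form is the one a shift-and-add instruction could match, which is the benefit claimed above.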
An example of a backend that benefits is ARC, which is demonstrated
by these two simple functions:
But there are also back-ends where this is bad.
The reason is that with ORI, the back-end needs only to operate on
these sub-words where the sub-mask is non-zero. But for PLUS this
is not the case because the back-end does not know that intermediate
carry will be zero. Hence, with PLUS, more instructions are needed.
An example is AVR, but maybe many more targets with multi-word operations
are affected in a bad way.
Take for example the case with 2 words and a value of 1.
LO |= 1
HI |= 0
can be optimized to
LO |= 1
but for addition this is not the case:
LO += 1
HI +=c 0 ;; Does not know that always carry = 0.
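
The two-word case above can be modelled in C as a rough sketch. The `word2` type, the 8-bit sub-word size and the function names are assumptions for illustration, not AVR back-end code. The IOR version touches only LO, while the PLUS version has to propagate a carry into HI unless something proves that bit 0 of LO is clear:

```c
#include <stdint.h>

/* A two-word value split into 8-bit sub-words (hypothetical model).  */
typedef struct { uint8_t lo, hi; } word2;

/* IOR with the constant 1: the mask for HI is zero, so HI is left
   alone and one sub-word operation suffices.  */
static word2 ior_const1(word2 x)
{
  x.lo |= 1;
  return x;
}

/* PLUS with the constant 1: without knowing that bit 0 of LO is
   clear, the carry out of LO must be added into HI, so two sub-word
   operations are needed.  */
static word2 plus_const1(word2 x)
{
  unsigned sum = x.lo + 1u;
  x.lo = (uint8_t) sum;
  x.hi = (uint8_t) (x.hi + (sum >> 8));
  return x;
}
```

When bit 0 of LO is known to be clear the two functions return the same value, but that knowledge is what gets lost once the operation is expressed as PLUS.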
I think it's clear that the decision is target and possibly uarch
specific within a target.
Which means that expmed is probably the right place and that we're going
to need to look for a good way for the target to control it. I suspect
rtx_cost isn't likely a good fit.
Jeff