On Thu, Jan 18, 2024 at 8:55 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> This patch tweaks RTL expansion of multi-word shifts and rotates to use
> PLUS rather than IOR for disjunctive operations.  During expansion of
> these operations, the middle-end creates RTL like (X<<C1) | (Y>>C2)
> where the constants C1 and C2 guarantee that bits don't overlap.
> Hence the IOR can be performed by any any_or_plus operation, such as
> IOR, XOR or PLUS; for word-size operations where carry chains aren't
> an issue these should all be equally fast (single-cycle) instructions.
> The benefit of this change is that targets with shift-and-add insns,
> like x86's lea, can benefit from the LSHIFT-ADD form.
>
> An example of a backend that benefits is ARC, which is demonstrated
> by these two simple functions:
>
> unsigned long long foo(unsigned long long x) { return x<<2; }
>
> which with -O2 is currently compiled to:
>
> foo:    lsr     r2,r0,30
>         asl_s   r1,r1,2
>         asl_s   r0,r0,2
>         j_s.d   [blink]
>         or_s    r1,r1,r2
>
> with this patch becomes:
>
> foo:    lsr     r2,r0,30
>         add2    r1,r2,r1
>         j_s.d   [blink]
>         asl_s   r0,r0,2
>
> unsigned long long bar(unsigned long long x) { return (x<<2)|(x>>62); }
>
> which with -O2 is currently compiled to 6 insns + return:
>
> bar:    lsr     r12,r0,30
>         asl_s   r3,r1,2
>         asl_s   r0,r0,2
>         lsr_s   r1,r1,30
>         or_s    r0,r0,r1
>         j_s.d   [blink]
>         or      r1,r12,r3
>
> with this patch becomes 4 insns + return:
>
> bar:    lsr     r3,r1,30
>         lsr     r2,r0,30
>         add2    r1,r2,r1
>         j_s.d   [blink]
>         add2    r0,r3,r0
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?

For expand_shift_1 you add

+            where C is the bitsize of A.  If N cannot be zero,
+            use PLUS instead of IOR.

but I don't see a check ensuring this, other than maybe
CONST_INT_P (op1) suggesting that we never end up with const0_rtx
here.  OTOH, why is N being zero a problem, and why is that not also
the case in optabs.cc, where I don't see any such check (at least
not an obvious one)?

Since this doesn't seem to fix a regression it probably has to wait
for stage1 to re-open.

Thanks,
Richard.

>
> 2024-01-18  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * expmed.cc (expand_shift_1): Use add_optab instead of ior_optab
>         to generate PLUS instead of IOR when unioning disjoint bitfields.
>         * optabs.cc (expand_subword_shift): Likewise.
>         (expand_binop): Likewise for double-word rotate.
>
>
> Thanks in advance,
> Roger
> --
>
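
For illustration only, here is one possible reading of the "N cannot be
zero" condition, sketched in plain C rather than RTL.  It assumes the
rotate is open-coded in a masked form, (A << N) | (A >> ((-N) & (C - 1))),
on a 32-bit word; the helper names rotl_ior and rotl_plus and the
WORD_BITS macro are hypothetical and do not come from the patch.  For a
nonzero N the two shifted halves occupy disjoint bits, so IOR and PLUS
agree.  For N == 0 both halves equal A, so IOR yields A while PLUS
yields 2*A.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 32u   /* hypothetical word size "C" for this sketch */

/* Rotate left by n, joining the two shifted halves with IOR.
   The right-shift count is masked, so n == 0 shifts by 0.  */
static inline uint32_t rotl_ior(uint32_t a, unsigned n)
{
    assert(n < WORD_BITS);
    return (uint32_t)((a << n) | (a >> ((0u - n) & (WORD_BITS - 1u))));
}

/* Same rotate, but joining the halves with PLUS.  This matches
   rotl_ior only when the halves share no set bits, i.e. n != 0.  */
static inline uint32_t rotl_plus(uint32_t a, unsigned n)
{
    assert(n < WORD_BITS);
    return (uint32_t)((a << n) + (a >> ((0u - n) & (WORD_BITS - 1u))));
}
```

With these definitions, rotl_ior(x, 2) and rotl_plus(x, 2) return the
same value for any x, whereas rotl_ior(5, 0) returns 5 and
rotl_plus(5, 0) returns 10.  Whether the expanders in expmed.cc and
optabs.cc can actually be reached with a zero count is the open
question raised above; this sketch does not answer it.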