Hi Roger,

Do you want to say bmsk_s instead of msk_s here:
+/* { dg-final { scan-assembler "msk_s\\s+r0,r0,0" } } */

Anyhow, the patch looks good. Proceed with your commit.

Thank you,
Claudiu

On Mon, Oct 30, 2023 at 5:05 AM Jeff Law <jeffreya...@gmail.com> wrote:
>
>
>
> On 10/28/23 10:47, Roger Sayle wrote:
> >
> > This patch optimizes PR middle-end/101955 for the ARC backend.  On ARC
> > CPUs with a barrel shifter, using two shifts is (probably) optimal as:
> >
> >         asl_s   r0,r0,31
> >         asr_s   r0,r0,31
> >
> > but without a barrel shifter, GCC -O2 -mcpu=em currently generates:
> >
> >         and     r2,r0,1
> >         ror     r2,r2
> >         add.f   0,r2,r2
> >         sbc     r0,r0,r0
> >
> > with this patch, we now generate the smaller, faster and non-flags
> > clobbering:
> >
> >         bmsk_s  r0,r0,0
> >         neg_s   r0,r0
> >
> > Tested with a cross-compiler to arc-linux hosted on x86_64,
> > with no new (compile-only) regressions from make -k check.
> > Ok for mainline if this passes Claudiu's nightly testing?
> >
> >
> > 2023-10-28  Roger Sayle  <ro...@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         PR middle-end/101955
> >         * config/arc/arc.md (*extvsi_1_0): New define_insn_and_split
> >         to convert sign extract of the least significant bit into an
> >         AND $1 then a NEG when !TARGET_BARREL_SHIFTER.
> >
> > gcc/testsuite/ChangeLog
> >         PR middle-end/101955
> >         * gcc.target/arc/pr101955.c: New test case.
> Good catch.  Looking to do something very similar on the H8 based on
> your work here.
>
> On the H8 we can use bld to load a bit from an 8 bit register into the
> C flag.  Then we use subtract with carry to get an 8 bit 0/-1 which we
> can then sign extend to 16 or 32 bits.  That covers bit positions 0..15
> of an SImode input.
>
> For bits 16..31 we can move the high half into the low half, then use the
> bld sequence.
>
> For bit zero the and+neg is the same number of clocks and size as the bld
> based sequence.  But it'll simulate faster, so it's special cased.
>
>
> Jeff
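
For reference, the equivalence discussed in this thread can be sketched in C. This is only an illustration under stated assumptions: the function names are made up for this note and do not come from the patch or from gcc.target/arc/pr101955.c, and the code relies on GCC's implementation-defined behaviour for converting out-of-range values to int32_t and for right-shifting negative values (arithmetic shift).

```c
#include <stdint.h>

/* Two-shift form (the barrel-shifter sequence asl_s/asr_s by 31):
   move bit 0 into the sign bit, then shift it back arithmetically.
   The left shift is done on an unsigned value to avoid signed overflow;
   the conversion back to int32_t and the signed right shift are
   implementation-defined in ISO C, but behave as expected with GCC.  */
static inline int32_t sext_bit0_shifts(int32_t x)
{
    return (int32_t)((uint32_t)x << 31) >> 31;
}

/* AND-then-NEG form (the bmsk_s/neg_s sequence): isolate bit 0,
   then negate, giving 0 or -1 without any shifts.  */
static inline int32_t sext_bit0_and_neg(int32_t x)
{
    return -(x & 1);
}

/* Hypothetical generalisation to bit n (0..31), in the spirit of the
   H8 discussion: extract one bit and turn it into 0 or -1.  */
static inline int32_t sext_bit(int32_t x, unsigned n)
{
    return -(int32_t)(((uint32_t)x >> n) & 1u);
}
```

Both bit-0 forms return -1 when the least significant bit of x is set and 0 otherwise, which is why the AND+NEG sequence can stand in for the two shifts when no barrel shifter is available.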