On Tue, Jun 15, 2021 at 5:17 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> This patch tackles PR46235 to improve the code generated for bit tests
> on x86_64 by making more use of the bt instruction.  Currently, GCC emits
> bt instructions when followed by conditional jumps (thanks to Uros'
> splitters).  This patch adds splitters in i386.md, to catch the cases
> where bt is followed by a conditional move (as in the original report),
> or by a setc/setnc (as in comment 5 of the Bugzilla PR).
>
> With this patch, the motivating function in the original PR
>
> int foo(int a, int x, int y) {
>   if (a & (1 << x))
>     return a;
>   return 1;
> }
>
> which with -O2 on mainline generates:
>
> foo:    movl    %edi, %eax
>         movl    %esi, %ecx
>         sarl    %cl, %eax
>         testb   $1, %al
>         movl    $1, %eax
>         cmovne  %edi, %eax
>         ret
>
> now generates:
>
> foo:    btl     %esi, %edi
>         movl    $1, %eax
>         cmovc   %edi, %eax
>         ret
>
> Likewise, IsBitSet1 (from comment 5)
>
> bool IsBitSet1(unsigned char byte, int index) {
>   return (byte & (1<<index)) != 0;
> }
>
> Before:
>         movzbl  %dil, %eax
>         movl    %esi, %ecx
>         sarl    %cl, %eax
>         andl    $1, %eax
>         ret
>
> After:
>         movzbl  %dil, %edi
>         btl     %esi, %edi
>         setc    %al
>         ret
>
> [Identical code is generated for comment 5's IsBitSet2]
>
> bool IsBitSet2(unsigned char byte, int index) {
>   return (byte >> index) & 1;
> }
>
> And finally, to demonstrate the corner cases also handled,
>
> int IsBitClr(long long dword, int index) {
>   return (dword & (1LL<<index)) == 0;
> }
>
> Before:
>         movq    %rdi, %rax
>         movl    %esi, %ecx
>         sarq    %cl, %rax
>         notq    %rax
>         andl    $1, %eax
>         ret
>
> After:
>         xorl    %eax, %eax
>         btq     %rsi, %rdi
>         setnc   %al
>         ret
>
> According to Agner Fog, SAR/SHR r,cl takes 2 cycles on skylake,
> where BT r,r takes only one, so the performance improvements on
> recent hardware may be more significant than implied by just the
> reduced number of instructions.  I've avoided transforming cases
> (such as btsi_setcsi) where using bt sequences may not be a clear
> win (over sarq/andl).
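
The source-level idioms quoted above can be checked against each other with a small harness. This is only a sketch under stated assumptions: `count_mismatches` is a hypothetical helper that is not part of the patch, and the index ranges are restricted so that no shift is undefined in C. It says nothing about which instructions the compiler picks; it only confirms that the two IsBitSet forms agree and that IsBitClr is their complement on the tested inputs.

```c
#include <stdbool.h>

/* The two byte-wide idioms from comment 5 of the PR. */
bool IsBitSet1(unsigned char byte, int index) {
    return (byte & (1 << index)) != 0;
}

bool IsBitSet2(unsigned char byte, int index) {
    return (byte >> index) & 1;
}

/* The 64-bit "bit clear" corner case from the patch description. */
int IsBitClr(long long dword, int index) {
    return (dword & (1LL << index)) == 0;
}

/* Hypothetical helper, not part of the patch: counts inputs on which
   the idioms disagree.  Indices stop at 30 because 1 << 31 overflows
   a 32-bit int. */
int count_mismatches(void) {
    int bad = 0;
    for (int b = 0; b < 256; b++) {
        unsigned char byte = (unsigned char)b;
        for (int i = 0; i <= 30; i++) {
            if (IsBitSet1(byte, i) != IsBitSet2(byte, i))
                bad++;
        }
        for (int i = 0; i < 8; i++) {
            /* IsBitClr must be the logical negation of IsBitSet1. */
            if (IsBitClr(b, i) == (int)IsBitSet1(byte, i))
                bad++;
        }
    }
    return bad;
}
```

Calling `count_mismatches()` from a `main` and checking for zero is enough to exercise it; comparing the `-O2 -S` output before and after the patch is a separate step.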
>
> This patch has been tested on x86_64-pc-linux-gnu with a "make
> bootstrap" and "make -k check" with no new failures.
>
> Ok for mainline?
>
> 2021-06-15  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR rtl-optimization/46235
>         * config/i386/i386.md: New define_split for bt followed by cmov.
>         (*bt<mode>_setcqi): New define_insn_and_split for bt followed by
>         setc.
>         (*bt<mode>_setncqi): New define_insn_and_split for bt then setnc.
>         (*bt<mode>_setnc<mode>): New define_insn_and_split for bt followed
>         by setnc with zero extension.
>
> gcc/testsuite/ChangeLog
>         PR rtl-optimization/46235
>         * gcc.target/i386/bt-5.c: New test.
>         * gcc.target/i386/bt-6.c: New test.
>         * gcc.target/i386/bt-7.c: New test.

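
The contents of the new test files are not shown in this thread. As a rough illustration only, a gcc.target/i386 test for the cmov case might be laid out along these lines. Everything here is an assumption: the scan patterns are derived from the before/after assembly quoted above, not from the actual bt-5.c.

```c
/* Hypothetical sketch; NOT the real bt-5.c from the patch.  */
/* { dg-do compile } */
/* { dg-options "-O2" } */

/* The motivating function from the original PR.  */
int foo(int a, int x, int y)
{
  if (a & (1 << x))
    return a;
  return 1;
}

/* Assumed expectations: a bt instruction is emitted and the
   shift-based sequence is gone.  */
/* { dg-final { scan-assembler "bt\[lq\]\[ \t\]" } } */
/* { dg-final { scan-assembler-not "sar\[lq\]\[ \t\]" } } */
```

The `dg-` lines are ordinary C comments, so the file also compiles outside the testsuite; only the DejaGnu harness interprets them.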
OK.

Thanks,
Uros.