[Bug middle-end/50168] __builtin_ctz() and intrinsics bsr(), bsf() generate suboptimal code on x86_64

jakub at gcc dot gnu.org Wed, 24 Aug 2011 02:47:14 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50168


--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-08-24 09:45:56 UTC ---
What we IMHO should optimize, and currently don't, is the redundant sign
extension when using __builtin_ffsl.  Because it is expanded internally as
bsf + cmove, nonzero_bits isn't able to figure out that the result of the
sequence can only have a few low bits set (on x86_64 the value is always in
the range 0 to 64).  Perhaps we should in that case add a REG_EQUAL note to
the last insn in the sequence, and perhaps nonzero_bits could also look at
REG_EQUAL notes.  Doing that could perhaps help even testcases like:
int foo (long x)
{
  return __builtin_popcountl (x) & 0xff;
}
where the andl $255, %eax could be optimized away (the popcount of a 64-bit
long is at most 64, so the mask never clears a bit), etc.
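
For reference, a minimal sketch of both cases as a standalone file.  The
names ffs_widened and check_ranges are hypothetical and only for
illustration; foo is the testcase above.  It only demonstrates the value
ranges that make the extension and the mask redundant, not what any
particular GCC version emits:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical testcase: __builtin_ffsl returns int, so widening the
   result to long needs an extension unless the compiler knows that
   only the low bits of the result can be set.  */
long ffs_widened (long x)
{
  return __builtin_ffsl (x);
}

/* The popcount testcase from the comment, unchanged.  */
int foo (long x)
{
  return __builtin_popcountl (x) & 0xff;
}

/* Both builtins return a value between 0 and the width of long, so
   the result is non-negative and masking with 0xff changes nothing.  */
void check_ranges (long x)
{
  int width = (int) (sizeof (long) * CHAR_BIT);
  int f = __builtin_ffsl (x);
  int p = __builtin_popcountl (x);

  assert (f >= 0 && f <= width);
  assert (p >= 0 && p <= width);
  assert ((p & 0xff) == p);
}
```

Compiling this with -O2 -S and looking at ffs_widened and foo should show
whether the extension and the andl are still emitted.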
