* I would like to do the same for __builtin_ctz, but there is a catch.
The synthetic ctz sequence in terms of popcount (as presently
implemented by ia64.md, and potentially usable for at least i386 and
rs6000 as well if moved to optabs.c) produces the canonical behavior at
zero, but the synthetic sequence in terms of clz (as presently
implemented by optabs.c) produces the value -1 at zero. I have not been
able to think of any refinement to that sequence that would reliably
produce GET_MODE_BITSIZE(mode) at zero in an efficient manner.
Furthermore, -1 is the value most convenient for implementing ffs in
terms of ctz. Opinions and/or clever bit manipulation hacks would be
much appreciated.
I suppose you're using (assuming 32-bit)

    ctz(x) := 31 - clz(x & -x)

now, which gives -1 for 0; and the version you're looking for is

    ctz(x) := 32 - clz(~x & (x-1))

which gives 32 for 0.
(Straight from the venerable PowerPC Compiler Writer's Guide, btw).
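
To make the two identities concrete, here is a rough C sketch. It assumes a clz that returns 32 for a zero input (as cntlzw does on PowerPC); the helper clz32 and the function names are made up for illustration and are not what optabs.c emits. The last function shows why -1 at zero is handy for ffs: adding 1 gives 0 for a zero argument.

```c
#include <stdint.h>

/* Hypothetical reference clz: assumes clz32(0) == 32. */
static int clz32(uint32_t x)
{
    int n = 0;
    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {   /* shift until the top bit is set */
        x <<= 1;
        n++;
    }
    return n;
}

/* 31 - clz(x & -x): isolates the lowest set bit; yields -1 at zero. */
static int ctz_neg1_at_zero(uint32_t x)
{
    return 31 - clz32(x & (0u - x));
}

/* 32 - clz(~x & (x-1)): masks the trailing zeros; yields 32 at zero. */
static int ctz_32_at_zero(uint32_t x)
{
    return 32 - clz32(~x & (x - 1u));
}

/* ffs in terms of the -1-at-zero variant: ffs(0) == 0 falls out for free. */
static int ffs_from_ctz(uint32_t x)
{
    return ctz_neg1_at_zero(x) + 1;
}
```

Both variants agree for every nonzero x; they differ only in what they return for 0.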
What does the popcount version look like? Never seen that before,
but I think it will be really expensive on PowerPC.
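
For reference, one well-known way to get ctz from popcount is to count the bits of the same trailing-zero mask used above. This is only a guess at the shape of the sequence; I have not checked that it is what ia64.md actually does, and popcount32 here is a made-up software stand-in for a hardware instruction.

```c
#include <stdint.h>

/* Hypothetical software popcount: clears the lowest set bit each pass. */
static int popcount32(uint32_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1u;
        n++;
    }
    return n;
}

/* popcount(~x & (x-1)): the mask has one bit per trailing zero of x,
   and is all ones when x == 0, so the result at zero is 32. */
static int ctz_via_popcount(uint32_t x)
{
    return popcount32(~x & (x - 1u));
}
```

Without a hardware popcount, the cost is dominated by the bit-counting loop (or whatever sequence replaces it).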
Segher