Hi, Not sure if this can be of any interest, but I noticed the ffs function is implemented using a loop.
A faster implementation exists using the "pcount" assembler instruction (where available and fast - this does not seem to be the case on all processors), or through another algorithm using only a few instructions. Check out: http://supertech.csail.mit.edu/papers/debruijn.pdf For a 32-bit word, the optimized function could be written this way: int ffs(int field) { static const int index[] = { 1, 2, 29, 3, 30, 15, 25, 4, 31, 23, 21, 16, 26, 18, 5, 9, 32, 28, 14, 24, 22, 20, 17, 8, 27, 13, 19, 7, 12, 6, 11, 10 }; unsigned int w = field; if (w == 0) return (0); w &= -w; w *= 125613361U; w >>= 27; return index[w]; } Regards, Olivier. This message posted from opensolaris.org _______________________________________________ perf-discuss mailing list perf-discuss@opensolaris.org