On Fri, Dec 06, 2013 at 12:02:59PM +0800, Alexander Wu wrote: > Some additional test for bitmap performance (10 million loops): > > suse-kvm-of13:/test # time ./bit3 > > real 0m0.061s > user 0m0.060s > sys 0m0.000s > suse-kvm-of13:/test # time ./bit2 > > real 0m0.088s > user 0m0.084s > sys 0m0.000s > suse-kvm-of13:/test # time ./bit1 > > real 0m0.080s > user 0m0.072s > sys 0m0.004s > > [bit1]: count1_bits > [bit2]: NumberOfSetBits from http://stackoverflow.com/a/109025 > [bit3]: precomputed16_bitcount from > http://gurmeet.net/puzzles/fast-bit-counting-routines/bitcount.c > > After reading the implements, I understand how precomputed16_bitcount wins. > count_1bits_32 inits 256 bytes array, while precomputed16_bitcount > inits 65536 bytes array and reduce the calculation: > > count1_bits: > return (count_1bits_8[x & 0xff] + > count_1bits_8[(x >> 8) & 0xff] + > count_1bits_8[(x >> 16) & 0xff] + > count_1bits_8[x >> 24]); > > precomputed16_bitcount: > return bits_in_16bits [n & 0xffffu] > + bits_in_16bits [(n >> 16) & 0xffffu] ; > > So how about change to the fastest one?
Please talk to Jarno, who has been looking at a related change. But I'm inclined to believe that a 65536-byte array wastes too much memory. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev