On Jun 19, 4:27 pm, godavemon <[EMAIL PROTECTED]> wrote: > I need to calculate the Hamming Distance of two integers. The hamming > distance is the number of bits in two integers that don't match. I > thought there'd be a function in math or scipy but i haven't been able > to find one. This is my function but it seems like there should be a > faster way. I do this computation many times and speed up is > important. > > def hamdist( a, b , bits = 32): > def _hamdist( x, bits): > if bits: > return (x & 1) + _hamdist(x >> 1, bits-1) > return x & 1 > return _hamdist( a ^ b, bits) > > Another alternative would be to convert the XOR to a binary string and > count the # of 1's. > > Which would be fastest? Are there better alternatives? > > Thanks!
I see no good reason to use recursion for this type of thing. Here are some of my attempts: [code] from math import log def yours(a, b , bits = 32): def _hamdist( x, bits): if bits: return (x & 1) + _hamdist(x >> 1, bits-1) return x & 1 return _hamdist(a ^ b, bits) def simple(a, b, bits=32): x = a ^ b return sum((x >> i & 1) for i in xrange(bits)) def lazy(a, b, bits=32): x = (a ^ b) & ((1 << bits) - 1) tot = 0 while x: tot += x & 1 x >>= 1 return tot def fancy(a, b, bits=32): x = (a ^ b) & ((1 << bits) - 1) tot = 0 while x: tot += 1 x ^= 1 << int(log(x, 2)) return tot test_vals = ( ((0xffffffff, 0), 32), ((0,0), 0), ((1,0), 1), ((0x80000000, 0), 1), ((0x55555555, 0), 16) ) def test(f): test_vals = ( ((0xffffffff, 0), 32), # ALL ((0,0), 0), # None ((1,0), 1), # First ((0x80000000, 0), 1), # Last ((0x55555555, 0), 16), # Every Other ((0xffff, 0), 16), # First Half ((0xffff0000, 0), 16), # Last Half ) for i, (args, exp) in enumerate(test_vals): if f(*args) != exp: return 0 return 1 if __name__ == "__main__": for f in (yours, simple, lazy, fancy): if not test(f): print "%s failed"%f.__name__ [/code] The python module `timeit` is handy for testing speed: python -mtimeit -s"from hamdist import *" "test(yours)" 10000 loops, best of 3: 95.1 usec per loop python -mtimeit -s"from hamdist import *" "test(simple)" 10000 loops, best of 3: 65.3 usec per loop python -mtimeit -s"from hamdist import *" "test(lazy)" 10000 loops, best of 3: 59.8 usec per loop python -mtimeit -s"from hamdist import *" "test(fancy)" 10000 loops, best of 3: 77.2 usec per loop Even the ridiculous `fancy` version beat the recursive version. Matt -- http://mail.python.org/mailman/listinfo/python-list