Roy> If you're getting long hash chains, you're either using a bad hash
Roy> function, or your table isn't big enough.
Sure.  The original poster said something about 10 million keys, I
think.  Unless the key distribution is amazingly fortuitous and the
number of unique values is small, the dictionary is going to have a
huge memory footprint.  On my Mac laptop this code:

>>> d = {}
>>> for n in xrange(10**7):
...     d[n] = hex(n)
...

yields a process with 647MB of VM.  If I trim that to 1000 unique
values:

>>> d = {}
>>> for n in xrange(10**7):
...     d[n] = hex(n % 1000)
...

I still wind up with 647MB of VM.  The dictionary and its keys are
what consume all the memory, and those keys are as simple as you can
get.

Skip
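
If you want to see roughly how that footprint splits up, one way is
sys.getsizeof (available in CPython 2.6+).  A sketch, with the caveat
that getsizeof is shallow, so the dict figure covers only the slot
table, not the key and value objects it points to:

import sys

d = {}
for n in xrange(10**7):
    d[n] = hex(n % 1000)

# Shallow size of the dict: just the table of hash/key/value slots,
# not the objects those slots reference.  With 10**7 entries the
# table alone runs to hundreds of MB on a 64-bit build.
print sys.getsizeof(d), "bytes for the hash table itself"

# Each small int key is a separate heap object on top of that
# (24 bytes each on a 64-bit CPython 2).
print sys.getsizeof(0), "bytes per int key"

# The 1000 hex strings are shared between entries, so the values
# barely register by comparison.
print sys.getsizeof(hex(0)), "bytes per (shared) string value"

Multiply the per-key numbers by 10**7 and add the table, and you land
in the same ballpark as the 647MB figure above.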