On 04/30/2010 12:51 PM, Helmut Jarausch wrote:
> I think one could apply an external hashing technique which would require only very few disk accesses per lookup. Unfortunately, I'm not aware of an implementation in Python. Does anybody know about a Python implementation of external hashing?
While you don't detail what you're hashing, Stephan Behnel already suggested (in the parent thread) using one of Python's native dbm modules (I just use anydbm and let it choose). The underlying implementations should be fairly efficient, assuming you don't end up on the dumbdbm last-resort fallback. With the anydbm interface, you can implement dict/set semantics as long as you take care that everything is marshalled into and out of strings for keys/values.
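
A rough sketch of that marshalling, in case it helps. It assumes Python 3, where anydbm was renamed to dbm; the put/get helper names, the "seen:" key prefix, and the temporary file path are all made up for illustration, and pickle is just one way to turn values into strings:

```python
import dbm          # this module was called anydbm in Python 2
import os
import pickle
import tempfile


def put(db, key, value):
    """Store any picklable value under a text key (hypothetical helper)."""
    db[key.encode("utf-8")] = pickle.dumps(value)


def get(db, key, default=None):
    """Return the unpickled value for key, or default if the key is absent."""
    raw_key = key.encode("utf-8")
    if raw_key in db:
        return pickle.loads(db[raw_key])
    return default


# Hypothetical file location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "lookup")

db = dbm.open(path, "c")    # "c": create the database if it does not exist
try:
    put(db, "spam", {"count": 3})   # dict semantics: key -> marshalled value
    put(db, "seen:eggs", True)      # set semantics: only the key's presence matters
    result = get(db, "spam")
    has_eggs = "seen:eggs".encode("utf-8") in db
    missing = get(db, "ham")
finally:
    db.close()
```

Which backend dbm.open() picks depends on what is installed, so the files that appear on disk may differ from one machine to the next.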
-tkc