"Claudio Grondi" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Chris Foote wrote: > > Hi all. > > > > I have the need to store a large (10M) number of keys in a hash table, > > based on a tuple of (long_integer, integer). The standard python > > dictionary works well for small numbers of keys, but starts to > > perform badly for me inserting roughly 5M keys: > > > > # keys dictionary metakit (both using psyco) > > ------ ---------- ------- > > 1M 8.8s 22.2s > > 2M 24.0s 43.7s > > 5M 115.3s 105.4s > > > > Has anyone written a fast hash module which is more optimal for > > large datasets ? > > > > p.s. Disk-based DBs are out of the question because most > > key lookups will result in a miss, and lookup time is > > critical for this application. > > > > Cheers, > > Chris > Python Bindings (\Python24\Lib\bsddb vers. 4.3.0) and the DLL for > BerkeleyDB (\Python24\DLLs\_bsddb.pyd vers. 4.2.52) are included in the > standard Python 2.4 distribution. > > "Berkeley DB was 20 times faster than other databases. It has the > operational speed of a main memory database, the startup and shut down > speed of a disk-resident database, and does not have the overhead of > a client-server inter-process communication." > Ray Van Tassle, Senior Staff Engineer, Motorola > > Please let me/us know if it is what you are looking for. > > Claudio

SQLite also supports an in-memory database; use pysqlite
(http://initd.org/tracker/pysqlite/wiki) to access it from Python.

-- Paul
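
For anyone who wants to try the in-memory SQLite route, here is a rough sketch rather than a tuned solution. It uses the sqlite3 module from the standard library (the pysqlite interface, bundled with Python since 2.5) instead of the separate pysqlite package. The table name `keys`, the column names `a` and `b`, and the helpers `build_key_store` and `has_key` are made up for illustration, and it assumes both parts of the key fit in a signed 64-bit integer, which is the limit for SQLite's INTEGER type. It makes no claim about speed relative to a plain dictionary; that would have to be measured.

```python
import sqlite3


def build_key_store(keys):
    """Load (long_integer, integer) pairs into an in-memory SQLite table.

    Hypothetical helper: the table and column names are illustrative only.
    """
    conn = sqlite3.connect(":memory:")  # the database lives only in RAM
    conn.execute(
        "CREATE TABLE keys ("
        " a INTEGER NOT NULL,"
        " b INTEGER NOT NULL,"
        " PRIMARY KEY (a, b))"  # the composite primary key provides the lookup index
    )
    with conn:  # one transaction for the whole bulk insert
        conn.executemany(
            "INSERT OR IGNORE INTO keys (a, b) VALUES (?, ?)", keys
        )
    return conn


def has_key(conn, a, b):
    """Return True if the pair (a, b) is stored, False on a miss."""
    row = conn.execute(
        "SELECT 1 FROM keys WHERE a = ? AND b = ? LIMIT 1", (a, b)
    ).fetchone()
    return row is not None


# Small demonstration with 1000 keys; scale the range up to benchmark.
conn = build_key_store([(2**40 + i, i % 7) for i in range(1000)])
print(has_key(conn, 2**40 + 5, 5))  # a key that was inserted
print(has_key(conn, 12345, 0))      # a key that was never inserted
```

Doing the bulk insert inside a single transaction avoids a commit per row. Since most lookups in the original problem are misses, the miss path of `has_key` is the part worth timing.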