On Monday 12 November 2007, Michael Bacarella wrote:
> As for the solution, after trying a half-dozen different integer
> hashing functions and hash table sizes (the brute force approach),
> on a total whim I switched to a model with two dictionary tiers and
> got whole orders of magnitude better performance.
>
> The tiering is, for a given key of type long:
>
> id2name[key >> 40][key & 0x10000000000] = name
>
> Much, much better. A few minutes versus hours this way.
>
> I suspect it could be brought down to seconds with a third level of
> tiers but this is no longer posing the biggest bottleneck... ;)

I don't know exactly why you need a dictionary to hold the data, but
if you want ultra-fast access to values, there is no replacement for
keeping three lists: a sorted list of keys, a list with the original
index of each key, and the list of values itself. Then, to access a
value, you only have to do a binary search on the sorted list, another
lookup in the original-indices list, and then go straight to the value
in the value list. This should be the fastest approach I can think of.

Another possibility is using an indexed column in a table in a DB.
Lookups there should be much faster than using a dictionary as well.

HTH,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
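
P.S. A minimal sketch of the three-list lookup described above, using
the standard bisect module for the binary search. The names
build_index and lookup and the sample data are made up for
illustration, and this is only one way to lay it out:

```python
from bisect import bisect_left


def build_index(keys):
    """Return (sorted_keys, original_indices) for a sequence of keys.

    original_indices[i] is the position in the original key (and value)
    list of the key that ends up at sorted_keys[i].
    """
    original_indices = sorted(range(len(keys)), key=keys.__getitem__)
    sorted_keys = [keys[i] for i in original_indices]
    return sorted_keys, original_indices


def lookup(key, sorted_keys, original_indices, values):
    """Binary-search the sorted keys, then follow the index to the value."""
    pos = bisect_left(sorted_keys, key)
    if pos == len(sorted_keys) or sorted_keys[pos] != key:
        raise KeyError(key)
    return values[original_indices[pos]]


# Hypothetical sample data: long integer ids and their names, unsorted.
keys = [(7 << 40) | 3, 42, (1 << 40) | 9]
values = ["gamma", "alpha", "beta"]

sorted_keys, original_indices = build_index(keys)
name = lookup(42, sorted_keys, original_indices, values)
```

Each lookup costs O(log n) comparisons, and the lists only need to be
rebuilt when keys are added or removed.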