On Aug 6, 10:56 pm, Michael Torrie <torr...@gmail.com> wrote:
> On 08/06/2010 07:56 PM, dmtr wrote:
>
> > Ultimately a dict that can store ~20,000,000 entries: (u'short
> > string' : (int, int, int, int, int, int, int)).
>
> I think you really need a real database engine.  With the proper
> indexes, MySQL could be very fast storing and retrieving this
> information for you.  And it will use your RAM to cache as it sees fit.
> Don't try to reinvent the wheel here.
No, I've tried. DB solutions are not even close in terms of speed; processing would take weeks :( Memcached and Redis sort of work, but they are still a bit too slow to be a pleasure to work with. The standard dict() container is *a lot* faster, and it is also hassle-free (it accepts unicode keys, etc.). I just wish there were a more compact dict container, optimized for large datasets and memory rather than for speed.

With the default dict() I'm also running into some kind of nonlinear performance degradation, apparently after 10,000,000-13,000,000 keys, but I can't reproduce it with a solid test case (see http://bugs.python.org/issue9520 ) :(

-- Dmitry
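P.S. To give an idea of what I mean by "more compact": a rough, untested sketch along these lines would keep a plain dict for the unicode-key lookups, but pack all the int tuples into one flat array.array instead of keeping ~20,000,000 separate tuple objects. The names here (CompactIntMap, NFIELDS) are made up for illustration, not anything that exists in the stdlib.

# Sketch only: dict maps unicode key -> row number, array('l') holds
# NFIELDS ints per row in one contiguous buffer.
from array import array

NFIELDS = 7  # seven ints per entry, as in the example above

class CompactIntMap(object):
    def __init__(self):
        self._index = {}          # unicode key -> row number
        self._data = array('l')   # flat value storage, NFIELDS ints per row

    def __setitem__(self, key, values):
        row = self._index.get(key)
        if row is None:
            # new key: append a fresh row at the end of the flat array
            self._index[key] = len(self._data) // NFIELDS
            self._data.extend(values)
        else:
            # existing key: overwrite its row in place
            start = row * NFIELDS
            self._data[start:start + NFIELDS] = array('l', values)

    def __getitem__(self, key):
        start = self._index[key] * NFIELDS
        return tuple(self._data[start:start + NFIELDS])

    def __len__(self):
        return len(self._index)

m = CompactIntMap()
m[u'short string'] = (1, 2, 3, 4, 5, 6, 7)
print(m[u'short string'])

The keys still live in a regular dict, so lookups stay as hassle-free as before; only the values collapse from millions of small tuples into a single buffer, which is where I'd expect most of the memory savings to come from.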