In <[EMAIL PROTECTED]>, lazy wrote:

> I have a dictionary something like this:
>
>     key1 => {key11 => [1,2], key12 => [6,7], ...}
>
> For lack of wording, I will call the outer dictionary dict1 and its
> value (the inner dictionary) dict2, which is a dictionary of small
> fixed-size lists (2 items).
>
> The key of the outer dictionary is a string and its value is another
> dictionary (let's say dict2). dict2 has a string key and a list of 2
> integers as its value.
>
> I'm processing HUGE data (~100M inserts into the dictionary).
> I tried 2 options; both seem to be slow, and I'm seeking suggestions
> to improve the speed. The code is in bits and pieces, so I'm just
> giving the idea.
>
> […]
>
> This is not getting up to speed even with option 2. Before inserting,
> I do some processing on the line, so the bottleneck is not clear to me
> (i.e. whether it's in the processing or in inserting into the db). But
> I guess it's mainly because of pickling and unpickling.
>
> Any suggestions will be appreciated :)
I guess your guess about the pickling as the bottleneck is correct, but measuring/profiling will give more confidence. Maybe a database other than bsddb would be useful here: an SQL one like SQLite, or maybe an object DB like ZODB or Durus.

Ciao,
	Marc 'BlackJack' Rintsch
--
http://mail.python.org/mailman/listinfo/python-list
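To make the SQLite suggestion concrete: since dict2 always holds exactly 2 integers, the nested structure maps onto one flat table, so nothing needs to be pickled on insert or lookup. A minimal sketch (table and column names are made up, not from the original post; for real data you would pass a file path instead of ":memory:"):

```python
import sqlite3

# Flatten key1 -> {key11 -> [v1, v2]} into rows of
# (outer_key, inner_key, v1, v2) -- no pickling needed.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE data (
        outer_key TEXT,
        inner_key TEXT,
        v1 INTEGER,
        v2 INTEGER
    )
""")

# Batch inserts inside one transaction; committing per row
# would be far too slow for ~100M inserts.
rows = [("key1", "key11", 1, 2), ("key1", "key12", 6, 7)]
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Build the lookup index after the bulk load so it does not
# slow the inserts down.
conn.execute("CREATE INDEX idx_keys ON data (outer_key, inner_key)")

print(conn.execute(
    "SELECT v1, v2 FROM data WHERE outer_key = ? AND inner_key = ?",
    ("key1", "key12")).fetchone())
```

Whether that beats bsddb here depends on where the time really goes, which is another reason to profile first (e.g. with the cProfile module).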