On Wed, 01 Aug 2007 15:47:21 -0800, Joshua J. Kugler wrote:

> My original data is 33MB.  When each row is converted to python lists,
> and inserted into a shelve DB, it balloons to 69MB.  Now, there is some
> additional data in there, namely a list of all the keys containing data
> (vs. the keys that contain version/file/config information), BUT if I
> copy all the data over to a dict and dump the dict to a file using
> cPickle, that file is only 49MB.  I'm using pickle protocol 2 in both
> cases.
>
> Is this expected?  Is there really that much overhead to using shelve
> and dbm files?  Are there any similar solutions that are more space
> efficient?  I'd use straight pickle.dump, but loading requires pulling
> the entire thing into memory, and I don't want to have to do that every
> time.
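For reference, the effect can be reproduced with a quick sketch along these lines (filenames, record count and record shape are made up; pickle protocol 2 in both cases, as above):

import glob
import os
import cPickle
import shelve

# Throw-away records just for the size comparison.
data = dict(('key%06d' % i, range(10)) for i in xrange(100000))

# One big dump with pickle protocol 2.
pickle_file = open('data.pickle', 'wb')
cPickle.dump(data, pickle_file, 2)
pickle_file.close()

# The same records stored one by one in a shelf, also protocol 2.
db = shelve.open('data.shelve', protocol=2)
for key, value in data.iteritems():
    db[key] = value
db.close()

# Some dbm backends add an extension or split the shelf over several
# files, so sum up everything that matches the base name.
shelve_size = sum(os.path.getsize(name)
                  for name in glob.glob('data.shelve*'))
print 'pickle:', os.path.getsize('data.pickle')
print 'shelve:', shelve_size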
You did not say how many records you store.  If the underlying DB used by `shelve` works with a hash table, that kind of "bloat" is to be expected: hash-based dbm backends pre-allocate bucket space so that key lookups stay fast.  It's a space vs. speed trade-off then.
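To see whether a hash table is actually involved, you can ask the stdlib's `whichdb` module which dbm implementation `anydbm` picked for the shelf.  A quick check (the filename is just an example; adjust it if your backend added an extension):

import whichdb

# Prints e.g. 'dbhash', 'gdbm' or 'dbm'; all of these are hash-based
# and trade file size for fast key lookups.
print whichdb.whichdb('data.shelve')

If it reports `gdbm`, that module also offers a `reorganize()` method that reclaims space after many deletions, although it won't remove the hash table's baseline overhead.

Ciao,
	Marc 'BlackJack' Rintsch
--
http://mail.python.org/mailman/listinfo/python-list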