John Machin wrote:
> On Mar 21, 9:25 am, Jim Garrison <j...@acm.org> wrote:
>> I'm converting a Perl system to Python, and have run into a severe
>> performance problem with pickle.
>>
>> One facet of the system involves scanning and loading into memory a
>> couple of parallel directory trees containing OTO 10^4 files. The
>> trees don't change during development/testing and the scan takes 30-40
>> seconds, so to save time I cache the loaded tree structure to disk, in
>> Perl with module Storable, and in Python with pickle.
>>
>> In Perl, the save operation produces a file of about 3MB, and both
>> save and restore take a second or two. In Python, pickle.dump()
>> produces a similar-size file but takes 20 seconds, and pickle.load()
>> takes 45 seconds, which is actually LONGER than the time required to
>> scan the directory trees.
>>
>> Is there anything I can do to speed up pickle.load() to get
>> performance comparable to Perl's Storable?
>
> Have you read this:
> http://www.python.org/doc/2.6/library/pickle.html
> ?
> Have you considered using cPickle instead of pickle?
> Have you considered using *ickle.dump(..., protocol=-1) ?

I'm using Python 3 on Windows (Server 2003). According to the docs, "The pickle module has an transparent optimizer (_pickle) written in C. It is used whenever available. Otherwise the pure Python implementation is used."

How can I tell if _pickle is being used?
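
One way to check this might be the sketch below. It assumes the CPython 3 layout, where pickle.py keeps its pure Python classes under underscore names (pickle._Pickler) and rebinds pickle.Pickler to the C class when _pickle can be imported. The helper name using_c_pickle is made up for illustration, and other interpreters or versions may be organized differently.

```python
import pickle


def using_c_pickle():
    """Return True if pickle.Pickler is the class from the C module _pickle.

    Assumption: CPython 3 rebinds pickle.Pickler to _pickle.Pickler when the
    accelerator imports successfully, and otherwise falls back to the pure
    Python class.
    """
    try:
        import _pickle
    except ImportError:
        # No C accelerator on this interpreter, so pure Python is in use.
        return False
    return pickle.Pickler is _pickle.Pickler


if __name__ == "__main__":
    print("C accelerator in use:", using_c_pickle())
    # The defining module of the class is a second, independent hint.
    print("pickle.Pickler is defined in:", pickle.Pickler.__module__)
```

If this reports that the accelerator is in use and load() is still slow, the time is presumably going somewhere other than the pure Python code path, and the protocol argument suggested above would be the next thing to try.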