Klaus Neuner wrote:
Yet, I have got 43 such files. Together they are 4,1M
large. In the future, they will probably become much larger. At the moment, the process takes several hours. As it is a process
that I have to run very often, I would like it to be faster.
Others have shown how you can make your dictionary code more efficient, which should provide a big speed boost, especially if there are many keys in your dicts.
However, if reading the files takes this long on each run, perhaps there's a better high-level approach than a brute-force scan of every file every time. You don't say anything about where those files come from or how they're created. Are they relatively static? (That is to say, are they (nearly) the same files being read on each run?) Do you control the process that creates the files?

Given the right conditions, you may be able to store your data in a shelve, or even a proper database, and save most of the time spent re-parsing these files on each run. Even if it's entirely new data on each run, you may be able to find a more efficient way of getting it from whatever the source is into your program. A rough sketch of the shelve idea follows.
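For instance, assuming your parsing can be factored into a per-file function and that a file's modification time is a reliable sign of change, something along these lines might work (parse_file, load_all, and the cache filename are just placeholders for whatever you actually do):

import os
import shelve

def parse_file(path):
    # Placeholder for whatever parsing you do now; here it just reads
    # one tab-separated key/value pair per line.
    records = {}
    f = open(path)
    for line in f:
        parts = line.rstrip('\n').split('\t', 1)
        if len(parts) == 2:
            records[parts[0]] = parts[1]
    f.close()
    return records

def load_all(paths, cache_path='parse_cache'):
    # Re-parse only the files whose modification time has changed
    # since the previous run; everything else comes from the shelve.
    data = {}
    cache = shelve.open(cache_path)
    try:
        for path in paths:
            mtime = os.path.getmtime(path)
            entry = cache.get(path)
            if entry is not None and entry[0] == mtime:
                data[path] = entry[1]        # unchanged: reuse cached parse
            else:
                records = parse_file(path)   # new or changed: parse again
                cache[path] = (mtime, records)
                data[path] = records
    finally:
        cache.close()
    return data

The first run costs about the same as what you have now, but later runs only pay for the files that actually changed. If you need to query the data in more complicated ways, a proper database would let you do that without loading everything back into memory each time.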
Jeff Shannon
Technician/Programmer
Credit International