On Wed, 17 Mar 2010 19:04:14 -0300, Keir Vaughan-taylor <kei...@gmail.com>
wrote:
I am traversing a large set of directories using
    for root, dirs, files in os.walk(basedir):
        run program
Because the directory set is huge, the traversal takes days.
Sometimes it crashes because of a programming error.
As each directory is processed, its name is written to a file.
I want to be able to restart the walk from the directory where it
crashed.
Is this possible?
If the 'dirs' list were guaranteed to be sorted, you could remove at each
level all previous directories already traversed. But it's not :(
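That said, os.walk in its default top-down mode lets the caller modify
'dirs' in place, so you can impose the order yourself by sorting it. Below
is a minimal sketch of a resumable walk built on that. The names walk_from
and resume_dir are hypothetical, and it assumes the interrupted run also
used the sorted order and that the tree has not changed much in between.
The resume directory itself is yielded again, since it may have been only
partly processed.

```python
import os

def walk_from(basedir, resume_dir=None):
    """Like os.walk(basedir), but in sorted order and skipping every
    directory that comes before resume_dir in that order.

    walk_from and resume_dir are illustrative names, not part of os.
    """
    # Path components of the resume point, relative to basedir.
    target = []
    if resume_dir is not None:
        rel = os.path.relpath(resume_dir, basedir)
        if rel != os.curdir:
            target = rel.split(os.sep)
    resumed = resume_dir is None

    for root, dirs, files in os.walk(basedir):
        dirs.sort()  # in-place sort makes the traversal order deterministic
        if not resumed:
            rel = os.path.relpath(root, basedir)
            parts = [] if rel == os.curdir else rel.split(os.sep)
            if parts != target and parts == target[:len(parts)]:
                # root is an ancestor of the resume point: it was already
                # processed, and so were the siblings sorting before the
                # next component of the resume path. Prune them.
                next_name = target[len(parts)]
                dirs[:] = [d for d in dirs if d >= next_name]
                continue
            # Either we reached the resume point or we are already past it
            # (for example because it no longer exists).
            resumed = True
        yield root, dirs, files
```

Usage would look like "for root, dirs, files in walk_from(basedir,
last_logged_dir): ...", where last_logged_dir is the last name read from
your log file.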
Perhaps a better approach would be to collect, once, all the directories
to be processed and write them to a text file -- these are the pending
directories. Then read the pending file and process every directory
listed in it. If the process aborts for any reason, manually delete the
lines already processed and restart.
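A rough sketch of the pending-file idea follows. The function names, the
file layout (one path per line) and the process_dir callback are my own
assumptions for illustration. It rewrites the pending file when processing
stops on an exception, which replaces the manual deletion in that case; a
hard kill of the interpreter would still leave the file untouched. Paths
containing newlines are not handled.

```python
import os

def write_pending(basedir, pending_path):
    """Walk basedir once and write every directory path, one per line."""
    with open(pending_path, "w", encoding="utf-8") as f:
        for root, dirs, files in os.walk(basedir):
            f.write(root + "\n")

def process_pending(pending_path, process_dir):
    """Call process_dir(path) for each pending directory.

    process_dir is a hypothetical callback standing in for "run program".
    """
    with open(pending_path, encoding="utf-8") as f:
        pending = [line.rstrip("\n") for line in f if line.strip()]
    done = 0
    try:
        for path in pending:
            process_dir(path)
            done += 1
    finally:
        # Keep only the entries that were not completed, so a restart
        # picks up at the directory that failed.
        with open(pending_path, "w", encoding="utf-8") as f:
            for path in pending[done:]:
                f.write(path + "\n")
```

Run write_pending once, then call process_pending again after each crash
until the file is empty.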
If you use a database instead of a text file, and mark entries as "done"
after processing, you can avoid that last manual step and the whole
process may be kept running automatically. In some cases you may want to
choose the starting point at random.
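For the database variant, something along these lines might do, using the
sqlite3 module from the standard library. The table name, column names,
function names and the randomize flag are all assumptions made for this
sketch, not an established recipe.

```python
import os
import sqlite3

def init_queue(db_path, basedir):
    """Create the table if needed and record every directory as pending."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success
        conn.execute(
            "CREATE TABLE IF NOT EXISTS dirs ("
            "path TEXT PRIMARY KEY, done INTEGER NOT NULL DEFAULT 0)"
        )
        # INSERT OR IGNORE keeps existing rows (and their done flag)
        # when this is run again after a crash.
        conn.executemany(
            "INSERT OR IGNORE INTO dirs (path) VALUES (?)",
            ((root,) for root, dirs, files in os.walk(basedir)),
        )
    return conn

def run_queue(conn, process_dir, randomize=False):
    """Process pending directories, marking each one done afterwards.

    process_dir is a hypothetical callback standing in for "run program".
    """
    order = "RANDOM()" if randomize else "rowid"
    query = "SELECT path FROM dirs WHERE done = 0 ORDER BY %s LIMIT 1" % order
    while True:
        row = conn.execute(query).fetchone()
        if row is None:
            break  # nothing left to do
        process_dir(row[0])
        with conn:
            conn.execute("UPDATE dirs SET done = 1 WHERE path = ?", (row[0],))
```

Because an entry is marked done only after process_dir returns, a crash
leaves it pending and the next run simply tries it again.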
--
Gabriel Genellina