On 13.04.2015 19:08, Peter Otten wrote:
How about a file-based workflow?
Write distinct scripts, e.g. a2b.py that reads from *.a and writes
to *.b, and so on. Then use a plain old makefile to define the dependencies.
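Something along these lines, perhaps (file names invented; make recipe
lines must be indented with a tab):

    # pipeline dependencies: .a -> .b -> .z
    data.b: data.a a2b.py
            python a2b.py data.a data.b

    data.z: data.b b2z.py
            python b2z.py data.b data.z

make then reruns only the steps whose inputs have changed.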
Whether .a uses pickle, .b uses json, and .z uses csv is but an
implementation detail that only the producers and consumers of each file
need to know.
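For instance, a2b.py might be little more than this sketch (it assumes
the records survive a pickle-to-json round trip):

    # a2b.py -- one hypothetical step: read pickled records from
    # the .a file, write them as json to the .b file.
    import json
    import pickle
    import sys

    def main(infile, outfile):
        with open(infile, "rb") as f:
            records = pickle.load(f)
        with open(outfile, "w") as f:
            json.dump(records, f)

    if __name__ == "__main__":
        main(sys.argv[1], sys.argv[2])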
Testing an arbitrary step is as easy as invoking the respective script with
some prefabricated input and checking the resulting output file(s).
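A pytest-style sketch of such a test, assuming the a2b.py above sits in
the current directory (tmp_path is pytest's temporary-directory fixture):

    # test_a2b.py -- pickle a known input, run the script as a
    # subprocess, and check the json it produces.
    import json
    import pickle
    import subprocess
    import sys

    def test_a2b(tmp_path):
        records = [{"x": 1}, {"x": 2}]
        infile = tmp_path / "sample.a"
        outfile = tmp_path / "sample.b"
        with open(infile, "wb") as f:
            pickle.dump(records, f)
        subprocess.check_call(
            [sys.executable, "a2b.py", str(infile), str(outfile)])
        with open(outfile) as f:
            assert json.load(f) == records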
I think I like the idea because it is more durable. The data I
manipulate comes in specific formats which are very efficient. With
pickle I was kind of "lazy" and, well, saved myself a couple of
read/write routines.
Still, your idea is probably more elegant.
With multiprocessing, do I have to care about processes writing
simultaneously to *different* files? I guess the OS takes good care of
this stuff, but I'm not an expert.
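Concretely, what I have in mind is something like this sketch (invented
names), where each worker writes only to its own file:

    # each worker process writes exclusively to its own output file,
    # so no file handle is ever shared between processes
    import multiprocessing

    def process_one(name):
        with open(name + ".out", "w") as f:
            f.write("result for %s\n" % name)

    if __name__ == "__main__":
        with multiprocessing.Pool() as pool:
            pool.map(process_one, ["job1", "job2", "job3"])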
Thanks,
Fabien