Re: Pickle based workflow - looking for advice

2015-04-14 Thread Chris Angelico
On Wed, Apr 15, 2015 at 12:14 AM, Steven D'Aprano wrote:
> On Tue, 14 Apr 2015 11:45 pm, Chris Angelico wrote:
>
>> On Tue, Apr 14, 2015 at 11:08 PM, Steven D'Aprano wrote:
>>> On Tue, 14 Apr 2015 05:58 pm, Fabien wrote:
>>>> On 14.04.2015 06:05, Chris Angelico wrote:
>>>>> Not sure what y

Re: Pickle based workflow - looking for advice

2015-04-14 Thread Steven D'Aprano
On Tue, 14 Apr 2015 11:45 pm, Chris Angelico wrote:
> On Tue, Apr 14, 2015 at 11:08 PM, Steven D'Aprano wrote:
>> On Tue, 14 Apr 2015 05:58 pm, Fabien wrote:
>>
>>> On 14.04.2015 06:05, Chris Angelico wrote:
>>>> Not sure what you mean, here. Any given file will be written by exactly one p

Re: Pickle based workflow - looking for advice

2015-04-14 Thread Chris Angelico
On Tue, Apr 14, 2015 at 11:08 PM, Steven D'Aprano wrote:
> On Tue, 14 Apr 2015 05:58 pm, Fabien wrote:
>
>> On 14.04.2015 06:05, Chris Angelico wrote:
>>> Not sure what you mean, here. Any given file will be written by
>>> exactly one process? No possible problem. Multiprocessing within one
>>> ap

Re: Pickle based workflow - looking for advice

2015-04-14 Thread Steven D'Aprano
On Tue, 14 Apr 2015 05:58 pm, Fabien wrote:
> On 14.04.2015 06:05, Chris Angelico wrote:
>> Not sure what you mean, here. Any given file will be written by
>> exactly one process? No possible problem. Multiprocessing within one
>> application doesn't change that.
>
> yes that's what I meant. Than

Re: Pickle based workflow - looking for advice

2015-04-14 Thread Fabien
On 14.04.2015 06:05, Chris Angelico wrote:
> Not sure what you mean, here. Any given file will be written by
> exactly one process? No possible problem. Multiprocessing within one
> application doesn't change that.
Yes, that's what I meant. Thanks!

Re: Pickle based workflow - looking for advice

2015-04-13 Thread Chris Angelico
On Tue, Apr 14, 2015 at 3:35 AM, Fabien wrote:
> With multiprocessing, do I have to care about processes writing
> simultaneously in *different* files? I guess the OS takes good care of this
> stuff but I'm not an expert.
Not sure what you mean, here. Any given file will be written by exactly one
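To illustrate the case discussed here, the following is a minimal sketch in which each worker process writes its result to its own pickle file, so no file is ever written by more than one process. The names `process_watershed`, `run_all`, the `watershed_<id>.pkl` file naming and the placeholder computation are assumptions made for illustration; they do not come from the thread.

```python
import os
import pickle
from multiprocessing import Pool


def process_watershed(task):
    """Compute a result for one watershed and pickle it to its own file."""
    watershed_id, out_dir = task
    # Placeholder computation (hypothetical); the real per-watershed work goes here.
    result = {"id": watershed_id, "area": watershed_id * 1.5}
    path = os.path.join(out_dir, f"watershed_{watershed_id}.pkl")
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return path


def run_all(watershed_ids, out_dir, processes=4):
    """Run process_watershed in a pool; every task writes a distinct file."""
    tasks = [(wid, out_dir) for wid in watershed_ids]
    with Pool(processes=processes) as pool:
        return pool.map(process_watershed, tasks)
```

A script would call `run_all(...)` with an existing output directory from under an `if __name__ == "__main__":` guard. Because the output path is derived from the watershed id, two tasks only collide if the same id is submitted twice.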

Re: Pickle based workflow - looking for advice

2015-04-13 Thread Fabien
On 13.04.2015 19:08, Peter Otten wrote:
> How about a file-based workflow? Write distinct scripts, e.g. a2b.py that reads from *.a and writes to *.b and so on. Then use a plain old makefile to define the dependencies. Whether .a uses pickle, .b uses json, and .z uses csv is but an implementatio
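As a rough sketch of what one step of such a file-based workflow might look like, the code below reads a pickled `.a` file and writes a JSON `.b` file, and skips the work when the output is already newer than the input (the timestamp rule a makefile would apply). The function names, the `values`/`total` keys and the transformation itself are hypothetical; only the `a2b` idea and the `.a`/`.b` extensions come from the message above.

```python
import json
import os
import pickle


def needs_rebuild(src, dst):
    """Make-style rule: rebuild if dst is missing or older than src."""
    return not os.path.exists(dst) or os.path.getmtime(dst) < os.path.getmtime(src)


def a2b(src):
    """Read a pickled .a file and write the derived .b file as JSON."""
    dst = os.path.splitext(src)[0] + ".b"
    if needs_rebuild(src, dst):
        with open(src, "rb") as f:
            data = pickle.load(f)
        # Placeholder transformation (hypothetical).
        derived = {"total": sum(data["values"])}
        with open(dst, "w") as f:
            json.dump(derived, f)
    return dst
```

With a real makefile the timestamp check would be make's job and the script would only do the conversion; it is included here so the sketch stands on its own.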

Re: Pickle based workflow - looking for advice

2015-04-13 Thread Fabien
On 13.04.2015 17:45, Devin Jeanpierre wrote:
> On Mon, Apr 13, 2015 at 10:58 AM, Fabien wrote:
>> Now, to my questions:
>> 1. Does that seem reasonable?
> A big issue is the use of pickle, which is:
> * Often suboptimal performance wise (e.g. you can't load only subsets of the data)
> * Makes forwards/ba

Re: Pickle based workflow - looking for advice

2015-04-13 Thread Fabien
On 13.04.2015 18:25, Dave Angel wrote:
> On 04/13/2015 10:58 AM, Fabien wrote:
>> Folks,
> A comment. Pickle is a method of creating persistent data, most commonly used to preserve data between runs. A database is another method. Although either one can also be used with multiprocessing, you seem

Re: Pickle based workflow - looking for advice

2015-04-13 Thread Peter Otten
Fabien wrote:
> I am writing a quite extensive piece of scientific software. Its
> workflow is quite easy to explain. The tool realizes series of
> operations on watersheds (such as mapping data on it, geostatistics and
> more). There are thousands of independent watersheds of different size,
> an

Re: Pickle based workflow - looking for advice

2015-04-13 Thread Dave Angel
On 04/13/2015 10:58 AM, Fabien wrote:
> Folks,
A comment. Pickle is a method of creating persistent data, most commonly used to preserve data between runs. A database is another method. Although either one can also be used with multiprocessing, you seem to be worrying more about the mechan

Re: Pickle based workflow - looking for advice

2015-04-13 Thread Robin Becker
For what it's worth, I believe that marshal is a faster method for storing simple Python objects. So if your information can be stored using simple Python things, e.g. strings, floats, integers, lists and dicts, then storage using marshal is faster than pickle/cPickle. If you want to persist the obje
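The distinction drawn here between simple built-in types and other objects can be sketched as follows. This only shows which objects each module accepts; it makes no speed measurement, and the sample data is invented for illustration. Note also that the marshal format is not guaranteed to be stable across Python versions.

```python
import datetime
import marshal
import pickle

# Sample data made only of built-in types (hypothetical values).
simple = {"name": "basin-1", "area": 12.5, "cells": [1, 2, 3]}

# marshal round-trips built-in types such as dicts, lists, strings and numbers.
restored = marshal.loads(marshal.dumps(simple))

# An object that is not one of marshal's supported types is rejected ...
survey_date = datetime.date(2015, 4, 13)
try:
    marshal.dumps(survey_date)
    marshal_ok = True
except ValueError:
    marshal_ok = False

# ... whereas pickle can serialize it.
clone = pickle.loads(pickle.dumps(survey_date))
```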

Re: Pickle based workflow - looking for advice

2015-04-13 Thread Devin Jeanpierre
On Mon, Apr 13, 2015 at 10:58 AM, Fabien wrote:
> Now, to my questions:
> 1. Does that seem reasonable?
A big issue is the use of pickle, which is:
* Often suboptimal performance wise (e.g. you can't load only subsets of the data)
* Makes forwards/backwards compatibility very difficult
* Can mak

Pickle based workflow - looking for advice

2015-04-13 Thread Fabien
Folks, I am writing a quite extensive piece of scientific software. Its workflow is quite easy to explain. The tool performs a series of operations on watersheds (such as mapping data onto them, geostatistics and more). There are thousands of independent watersheds of different sizes, and the size d