Re: Parallel/Multiprocessing script design question

2007-09-13 Thread Paddy
Hi Amit, Why not create a list of those 800+ files and a script that, when run, looks for an environment variable that will be a number from 1 to 800, selects the file at that line number and processes it fully? For the process control, install a job scheduler such as LSF or the Sun Grid Engine ht
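
A minimal sketch of the selection step Paddy describes, assuming a plain-text list with one file name per line. The variable name TASK_INDEX, the function names, and the placeholder process() are illustrative and do not come from the thread; a scheduler's array-job feature would normally supply its own index variable (for example LSB_JOBINDEX under LSF or SGE_TASK_ID under Grid Engine), which can be passed in as env_var.

```python
import os
import sys


def select_file(list_path, env_var="TASK_INDEX"):
    """Return the file name on the 1-based line number stored in env_var.

    env_var defaults to a hypothetical name; pass the variable that your
    scheduler actually sets for array jobs.
    """
    index = int(os.environ[env_var])
    with open(list_path) as fh:
        # Ignore blank lines so a trailing newline does not shift the count.
        names = [line.strip() for line in fh if line.strip()]
    if not 1 <= index <= len(names):
        raise IndexError("index %d is outside 1..%d" % (index, len(names)))
    return names[index - 1]


def process(path):
    """Placeholder for the real per-file work (uncompress, filter, parse)."""
    return "processed " + path


if __name__ == "__main__":
    # Usage (hypothetical): TASK_INDEX=17 python worker.py files.txt
    print(process(select_file(sys.argv[1])))
```

Each scheduled task then handles exactly one file, so the scheduler rather than the script decides how many run at once.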

Re: Parallel/Multiprocessing script design question

2007-09-13 Thread Ivan Voras
Amit N wrote: About 800+ 10-15MB files are generated daily that need to be processed. The processing consists of different steps that the files must go through: -Uncompress -FilterA -FilterB -Parse -Possibly compress parsed files for archival You can implement one of two easy straightforward
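
Ivan's reply is cut off above, so the following is not necessarily either of the approaches he goes on to describe. It is only a rough sketch of one way to run the steps Amit lists on each file, with files spread across worker processes. The filter and parse rules are made-up placeholders, the ".parsed.gz" suffix is an assumption, and the multiprocessing module used here joined the standard library after this thread was written.

```python
import gzip
from multiprocessing import Pool


def filter_a(lines):
    # Placeholder rule (assumption): drop blank lines.
    return [line for line in lines if line.strip()]


def filter_b(lines):
    # Placeholder rule (assumption): drop lines starting with '#'.
    return [line for line in lines if not line.startswith("#")]


def parse(lines):
    # Placeholder parser (assumption): split each line on whitespace.
    return [line.split() for line in lines]


def process_file(path):
    """Uncompress, filter, parse, and re-compress one gzip file."""
    with gzip.open(path, "rt") as fh:
        lines = fh.read().splitlines()
    records = parse(filter_b(filter_a(lines)))
    out_path = path + ".parsed.gz"  # hypothetical naming scheme
    with gzip.open(out_path, "wt") as fh:
        for record in records:
            fh.write("\t".join(record) + "\n")
    return out_path


def process_all(paths, workers=4):
    """Run process_file on every path using a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(process_file, paths)
```

Because every file is independent, one process per file avoids any shared state; on platforms that spawn rather than fork, call process_all from under an `if __name__ == "__main__":` guard.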

Re: Parallel/Multiprocessing script design question

2007-09-13 Thread A.T.Hofkamp
On 2007-09-13, Amit N <[EMAIL PROTECTED]> wrote: > Hi guys, > > I tend to ramble, and I am afraid none of you busy experts will bother > reading my long post, so I will try to summarize it first: I haven't read the details, but you seem to aim for a single python program that does 'it'. A single

Re: Parallel/Multiprocessing script design question

2007-09-12 Thread Martin v. Löwis
> I tend to ramble, and I am afraid none of you busy experts will bother > reading my long post I think that's a fairly accurate description, and prediction. > I am hoping people > with experience using any of these would chime in with tips. The main thing > I would look for in a toolkit is ma

Parallel/Multiprocessing script design question

2007-09-12 Thread Amit N
Hi guys, I tend to ramble, and I am afraid none of you busy experts will bother reading my long post, so I will try to summarize it first: 1. I have a script that processes ~10GB of data daily and runs for a long time, and I need to parallelize it on a multi-CPU/multicore system. I am trying to d