Hi Amit,
Why not create a list of those 800+ files and a script that, when run,
looks for an environment variable holding a number from 1 to 800,
selects the file at that line number, and processes it fully?
For process control, install a job scheduler such as LSF or Sun Grid
Engine.
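A minimal sketch of such a worker, assuming the list lives in a file
called filelist.txt and process_file() stands in for the real work (SGE
array jobs export SGE_TASK_ID, LSF array jobs export LSB_JOBINDEX):

#!/usr/bin/env python
# worker.py -- process the file named on line N of a fixed file list.
# SGE sets SGE_TASK_ID for array jobs; LSF sets LSB_JOBINDEX.
# filelist.txt and process_file() are placeholders for this sketch.
import os

def process_file(path):
    print("processing %s" % path)   # uncompress/filter/parse goes here

def main():
    index = int(os.environ.get("SGE_TASK_ID")
                or os.environ["LSB_JOBINDEX"])
    with open("filelist.txt") as listing:
        paths = listing.read().splitlines()
    process_file(paths[index - 1])  # task indices are 1-based

if __name__ == "__main__":
    main()

Under SGE the whole batch can then go in as one array job, something
like qsub -t 1-800, and the scheduler spreads the 800 tasks over
whatever CPUs it manages.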
Amit N wrote:
> About 800+ 10-15MB files are generated daily that need to be processed. The
> processing consists of different steps that the files must go through:
> -Uncompress
> -FilterA
> -FilterB
> -Parse
> -Possibly compress parsed files for archival
You can implement this in one of two straightforward ways.
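Either way, the per-file work can be a single function chaining the
quoted steps. A rough sketch, assuming gzip compression; filter_a(),
filter_b() and parse() below are only placeholders for the real filters
and parser:

import gzip
import shutil

def filter_a(path):        # placeholder for FilterA
    return path

def filter_b(path):        # placeholder for FilterB
    return path

def parse(path):           # placeholder for the parser
    out = path + ".parsed"
    shutil.copyfile(path, out)
    return out

def process(path):
    # Uncompress foo.gz -> foo (assuming gzip; adjust for the format)
    if path.endswith(".gz"):
        raw = path[:-3]
        with gzip.open(path, "rb") as src, open(raw, "wb") as dst:
            shutil.copyfileobj(src, dst)
    else:
        raw = path
    parsed = parse(filter_b(filter_a(raw)))
    # Re-compress the parsed result for archival: foo.parsed.gz
    with open(parsed, "rb") as src, gzip.open(parsed + ".gz", "wb") as dst:
        shutil.copyfileobj(src, dst)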
On 2007-09-13, Amit N <[EMAIL PROTECTED]> wrote:
> Hi guys,
>
> I tend to ramble, and I am afraid none of you busy experts will bother
> reading my long post, so I will try to summarize it first:
I haven't read the details, but you seem to be aiming for a single
Python program that does 'it'. A single monolithic program is harder to
parallelize than a collection of small per-file jobs.
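For example, a tiny driver that keeps a bounded number of single-file
workers running at once (NCPU, filelist.txt and process_one.py are all
made up for this sketch; process_one.py would handle just the one file
named on its command line):

#!/usr/bin/env python
# driver.py -- fan out one small process per file instead of one
# big program, never running more than NCPU children at a time.
import subprocess
import sys
import time

NCPU = 4   # how many children to keep running at once

def run_all(paths):
    running = []
    for path in paths:
        while len(running) >= NCPU:          # simple throttle
            running = [p for p in running if p.poll() is None]
            time.sleep(0.1)
        running.append(subprocess.Popen(
            [sys.executable, "process_one.py", path]))
    for p in running:
        p.wait()

if __name__ == "__main__":
    with open("filelist.txt") as listing:
        run_all(listing.read().splitlines())

Because each worker is an ordinary process handling one file, the same
workers can later be farmed out by a batch scheduler instead of this
driver, with no changes to the processing code.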
> I tend to ramble, and I am afraid none of you busy experts will bother
> reading my long post
I think that's a fairly accurate description, and prediction.
> I am hoping people
> with experience using any of these would chime in with tips. The main thing
> I would look for in a toolkit is maturity.
Hi guys,
I tend to ramble, and I am afraid none of you busy experts will bother
reading my long post, so I will try to summarize it first:
1. I have a script that processes ~10GB of data daily and runs for a
long time, so I need to parallelize it on a multi-CPU/multicore system.
I am trying to decide which parallel-processing toolkit to use.
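Roughly, the shape I am after is a pool of worker processes eating the
file list, something like this sketch with the multiprocessing module
that ships with current Python (process_file() and filelist.txt are
stand-ins for my real pipeline and file list):

import multiprocessing

def process_file(path):
    # uncompress / filter / parse one file here
    return path

if __name__ == "__main__":
    with open("filelist.txt") as listing:
        paths = listing.read().splitlines()
    pool = multiprocessing.Pool()   # one worker per core by default
    for done in pool.imap_unordered(process_file, paths):
        print("finished %s" % done)
    pool.close()
    pool.join()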