Re: parallel csv-file processing

2007-11-09 Thread Paul Boddie
On 9 Nov, 12:02, Paul Rubin wrote:
>
> Why not pass the disk offsets to the job server (untested):
>
>    n = 1000
>    for i,_ in enumerate(reader):
>        if i % n == 0:
>            job_server.submit(calc_scores, reader.tell(), n)
>
> the remote process seeks to the approp
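The offset idea quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the thread: `chunk_offsets` and `read_chunk` are hypothetical helper names, the file is opened directly in binary mode (a `csv.reader` object has no `tell()` method of its own), and the sketch assumes one record per physical line, so quoted fields containing newlines would break it.

```python
import csv

def chunk_offsets(path, n):
    """Return the byte offset at which every n-th line starts (hypothetical helper)."""
    offsets = []
    with open(path, "rb") as f:
        i = 0
        while True:
            pos = f.tell()          # position before reading the line
            line = f.readline()
            if not line:            # end of file
                break
            if i % n == 0:
                offsets.append(pos)
            i += 1
    return offsets

def read_chunk(path, offset, n):
    """Seek to offset and parse at most n lines as CSV rows (hypothetical helper)."""
    with open(path, "rb") as f:
        f.seek(offset)
        # Assumes UTF-8 text and one record per line.
        lines = [f.readline().decode("utf-8") for _ in range(n)]
    # readline() returns b"" past end of file; drop those empty strings.
    return list(csv.reader(line for line in lines if line))
```

Each worker would then receive only `(path, offset, n)` rather than the rows themselves, and call something like `read_chunk` on its own side.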

Re: parallel csv-file processing

2007-11-09 Thread Marc 'BlackJack' Rintsch
On Fri, 09 Nov 2007 02:51:10 -0800, Michel Albert wrote:

> Obviously this won't work as you cannot access a slice of a csv-file.
> Would it be possible to subclass the csv.reader class in a way that
> you can somewhat efficiently access a slice?

An arbitrary slice? I guess not as all records bef

Re: parallel csv-file processing

2007-11-09 Thread Paul Rubin
Michel Albert <[EMAIL PROTECTED]> writes:
> buffer = []
> for line in reader:
>    buffer.append(line)
>    if len(buffer) == 1000:
>       f = job_server.submit(calc_scores, buffer)
>       buffer = []
>
> f = job_server.submit(calc_scores, buffer)
> buffer = []
>
> but would this not kill my me
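For illustration, the buffering loop quoted above could also be written as a generator that hands out one batch at a time. This is a minimal sketch, assuming only the standard library: `batches` is a hypothetical name, and the call to `job_server.submit` from the original post is left out because its exact behaviour is not shown in the thread.

```python
import csv
import io
from itertools import islice

def batches(reader, size):
    """Yield lists of at most `size` rows from any row iterator (hypothetical helper)."""
    while True:
        batch = list(islice(reader, size))  # pull the next `size` rows, if any
        if not batch:                       # iterator exhausted
            return
        yield batch

# Usage with an in-memory CSV standing in for the real file:
sample = io.StringIO("a,1\nb,2\nc,3\n")
for batch in batches(csv.reader(sample), 2):
    print(len(batch), batch)
```

Only one batch is built at a time here. Whether total memory stays bounded still depends on how many submitted batches the job queue keeps alive at once.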