Michel Albert <[EMAIL PROTECTED]> writes:
> buffer = []
> for line in reader:
>     buffer.append(line)
>     if len(buffer) == 1000:
>         f = job_server.submit(calc_scores, buffer)
>         buffer = []
>
> f = job_server.submit(calc_scores, buffer)
> buffer = []
>
> but would this not kill my memory if I start loading bigger slices
> into the "buffer" variable?

Why not pass the disk offsets to the job server (untested):

    n = 1000
    for i,_ in enumerate(reader):
        if i % n == 0:
            job_server.submit(calc_scores, reader.tell(), n)

The remote process seeks to the appropriate place and processes n lines
starting from there.
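
A slightly fuller sketch of the offset idea, under some assumptions: the input is a plain file that every worker can open by path, and the names chunk_offsets, demo and path are made up for illustration. The calc_scores here is only a placeholder that counts lines. The offset is recorded before each line is read, and it is computed by summing line lengths in binary mode rather than by calling tell() during iteration, which can be unreliable (or disallowed) depending on how the file is being read.

```python
import os
import tempfile
from itertools import islice


def chunk_offsets(path, n):
    """Yield the byte offset at which every n-th line starts."""
    offset = 0
    with open(path, "rb") as f:      # binary mode: len(line) is a byte count
        for i, line in enumerate(f):
            if i % n == 0:
                yield offset         # position *before* line i
            offset += len(line)


def calc_scores(path, offset, n):
    """Placeholder worker: seek to offset and process at most n lines."""
    with open(path, "rb") as f:
        f.seek(offset)
        lines = list(islice(f, n))   # only n lines are ever held in memory
    return len(lines)                # stand-in for the real scoring


def demo():
    # Build a throwaway 2500-line file, then split it into jobs of 1000 lines.
    with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
        for i in range(2500):
            tmp.write(b"row %d\n" % i)
        path = tmp.name
    try:
        jobs = [(path, off, 1000) for off in chunk_offsets(path, 1000)]
        # Assumption: with a job server, each tuple would be submitted
        # instead of being called directly as done here.
        return [calc_scores(*job) for job in jobs]
    finally:
        os.remove(path)
```

Calling demo() returns [1000, 1000, 500]. The submitting process only ever holds one line plus a list of small tuples, so memory use no longer grows with the slice size.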