braver <[EMAIL PROTECTED]> wrote:

> In many cases, you want to do this:
>
> for line in f:
>     <do something with the line, setup counts and things>
>     if line % 1000 == 0 or f.eof(): # eof() doesn't exist in Python yet!
>         <use the setup variables and things to process the chunk>
>
> My control logic summarizes every 1000 lines of a file.  I have to
> issue the summary after each 1000 lines, or whatever incomplete tail
> chunk remains.  If I do it after the for loop, I have to refactor my
> logic into a procedure to call it twice.  Now I want to avoid the
> overhead of the procedure call, and generally for a script to keep it
> simple.

This sounds like a case for writing a generator. Try this one:

----- begin chunks.py -------
import itertools
def chunks(f, size):
    iterator = iter(f)
    def onechunk(line):
        yield line
        for line in itertools.islice(iterator, size-1):
            yield line
    for line in iterator:
        yield onechunk(line)

for chunk in chunks(open('chunks.py'), 3):
    for n, line in enumerate(chunk):
        print "%d:%s" % (n,line.rstrip())
    print "---------------"
print "done"
#eof
------ end chunks.py --------

The output when you run this is:

C:\Temp>chunks.py
0:import itertools
1:def chunks(f, size):
2:    iterator = iter(f)
---------------
0:    def onechunk(line):
1:        yield line
2:        for line in itertools.islice(iterator, size-1):
---------------
0:            yield line
1:    for line in iterator:
2:        yield onechunk(line)
---------------
0:
1:for chunk in chunks(open('chunks.py'), 3):
2:    for n, line in enumerate(chunk):
---------------
0:        print "%d:%s" % (n,line.rstrip())
1:    print "---------------"
2:print "done"
---------------
0:#eof
---------------
done

Or change it to do:

for chunk in chunks(enumerate(open('chunks.py')), 3):
    for n, line in chunk:

and you get all lines numbered from 0 to 15 instead of resetting the
count each chunk.
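
To tie this back to the original question, here is a rough sketch of a per-chunk summary that handles the incomplete tail chunk with no special case. It is written for Python 3, unlike the script above. The summarize() helper and the statistics it collects (line count and total characters) are invented purely for illustration, and io.StringIO stands in for a real file. One caveat that I believe applies to this generator design: all chunks share one underlying iterator, so each chunk should be consumed completely before moving on to the next.

```python
import io
import itertools

def chunks(iterable, size):
    """Yield successive sub-iterators of at most `size` items each."""
    iterator = iter(iterable)
    def onechunk(first):
        yield first
        for item in itertools.islice(iterator, size - 1):
            yield item
    for first in iterator:
        yield onechunk(first)

def summarize(lines, size):
    """Hypothetical summary: (line count, total characters) per chunk."""
    summaries = []
    for chunk in chunks(lines, size):
        count = 0
        total_chars = 0
        for line in chunk:  # consume the chunk fully before advancing
            count += 1
            total_chars += len(line.rstrip("\n"))
        # The summary step runs once per chunk, including a short tail.
        summaries.append((count, total_chars))
    return summaries

# Stand-in for a real file: 7 lines with a chunk size of 3.
sample = io.StringIO("".join("line %d\n" % i for i in range(7)))
result = summarize(sample, 3)
print(result)
```

The summary code appears only once, inside the outer loop, so no eof() test and no second call after the loop should be needed.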