On Wed, 13 Jul 2005 03:49:16 +0200, Thomas Lotze <[EMAIL PROTECTED]> wrote:
>Scott David Daniels wrote:
>
>> Now if you want to do it for a file, you could do:
>>
>>     for c in thefile.read():
>>         ....
>
>The whole point of the exercise is that seeking on a file doesn't
>influence iteration over its content. In the loop you suggest, I can
>seek() on thefile to my heart's content and will always get its content
>iterated over exactly from beginning to end. It had been read before any
>of this started, after all. Similarly, thefile.tell() will always tell me
>thefile's size or the place I last seek()'ed to instead of the position of
>the next char I will get.
>

What I suggested in my other post (untested beyond what you see, so you
may want to add to the test):

----< lotzefile.py >--------------------------------------------------
class LotzeFile(file):
    BUFSIZE = 4096
    def __init__(self, path, mode='r'):
        self.f = file(path, mode)
        # bufbase is the file offset of buf[0]; pos indexes into buf
        self.pos = self.bufbase = 0
        self.buf = ''
    def __iter__(self): return self
    def next(self):
        # refill the buffer once every char in it has been handed out
        if not self.buf[self.pos:]:
            self.bufbase += len(self.buf)
            self.pos = 0
            self.buf = self.f.read(self.BUFSIZE)
            if not self.buf:
                self.close()
                raise StopIteration
        byte = self.buf[self.pos]
        self.pos += 1
        return byte
    def seek(self, pos, ref=0):
        # discard the buffer so the next char comes from the new position
        self.f.seek(pos, ref)
        self.bufbase = self.f.tell()
        self.pos = 0
        self.buf = ''
    def tell(self):
        # position of the next char next() will return
        return self.bufbase + self.pos
    def close(self):
        self.f.close()

def test():
    # write the numbers 0..999, each right-justified in a 4-char field
    f = file('lotzedata.txt','w')
    for s in (' %3d'%i for i in xrange(1000)): f.write(s)
    f.close()

    it = iter(LotzeFile('lotzedata.txt'))

    hold4=[0,0,0,0]
    for i, c in enumerate(it):
        hold4[i%4] = c
        if i%4==3:
            print hold4
            assert (i-3)/4 == int(''.join(hold4))
        if i == 99: break
    print it.tell()
    it.seek(52)
    for i in xrange(8): print it.next(),
    print
    it.seek(990*4)
    for c in it: print c,

if __name__ == '__main__':
    test()
----------------------------------------------------------------------

Result:

[20:53] C:\pywk\clp>py24 lotze.py
[' ', ' ', ' ', '0']
[' ', ' ', ' ', '1']
[' ', ' ', ' ', '2']
[' ', ' ', ' ', '3']
[' ', ' ', ' ', '4']
[' ', ' ', ' ', '5']
[' ', ' ', ' ', '6']
[' ', ' ', ' ', '7']
[' ', ' ', ' ', '8']
[' ', ' ', ' ', '9']
[' ', ' ', '1', '0']
[' ', ' ', '1', '1']
[' ', ' ', '1', '2']
[' ', ' ', '1', '3']
[' ', ' ', '1', '4']
[' ', ' ', '1', '5']
[' ', ' ', '1', '6']
[' ', ' ', '1', '7']
[' ', ' ', '1', '8']
[' ', ' ', '1', '9']
[' ', ' ', '2', '0']
[' ', ' ', '2', '1']
[' ', ' ', '2', '2']
[' ', ' ', '2', '3']
[' ', ' ', '2', '4']
100
1 3 1 4
9 9 0 9 9 1 9 9 2 9 9 3 9 9 4 9 9 5 9 9 6 9 9 7 9 9 8 9 9 9

I suspect you could get better performance if you made LotzeFile
instances able to return iterators over buffer chunks and got the
characters from those. The built-in string iterators would then supply
the characters instead of the custom .next, but the buffer chunks would
have to be of some size to make that pay. Testing is the only way to find
out what the crossing point is, if you really have to.

Regards,
Bengt Richter
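
A minimal sketch of the chunk-iterator idea mentioned above, written for Python 3 rather than the Python 2.4 used in the post. The names iter_chunks and iter_chars are hypothetical and not part of the posted class, and the file object is assumed to be a seekable text stream. The trade-off is that a seek() only takes effect at the next chunk boundary, and f.tell() reports the end of the chunk already read rather than the position of the next character, so this gives up part of what LotzeFile provides in exchange for using the built-in string iterator.

```python
import io

def iter_chunks(f, bufsize=4096):
    """Yield successive chunks read from f's current position until EOF."""
    while True:
        chunk = f.read(bufsize)
        if not chunk:
            return
        yield chunk

def iter_chars(f, bufsize=4096):
    """Yield single characters, letting each chunk's own iterator do the work."""
    for chunk in iter_chunks(f, bufsize):
        # Each chunk is fetched lazily, so a seek() on f made between
        # chunks changes where the next chunk is read from.
        yield from chunk

if __name__ == '__main__':
    # Same test data as the post: 0..999, each in a 4-char field.
    data = ''.join(' %3d' % i for i in range(1000))
    f = io.StringIO(data)
    f.seek(990 * 4)
    print(''.join(iter_chars(f)))
```

Whether this beats a per-character next() depends on the chunk size, so it would need timing against the original before drawing conclusions.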