Wolfgang Maier wrote:

> I just came across an unexpected behavior in Python 3.3, which has to do
> with file iterators and their interplay with other methods of the file/IO
> classes, like readline() and tell(): Basically, I got used to the
> fact that it is a bad idea to mix them because the iterator would use that
> hidden read-ahead buffer, so what you got with subsequent calls to
> readline() or tell() was what was beyond that buffer, but not the next
> thing after what the iterator just returned.
>
> Example:
>
>     in_file_object = open(some_file, 'rb')
>     for line in in_file_object:
>         print(line)
>         if in_file_object.tell() > 300:
>             # assuming that individual lines are shorter
>             break
>
> This wouldn't print anything in Python 2.7 since next(in_file_object)
> would read ahead beyond the 300 position immediately, as evidenced by a
> subsequent call to in_file_object.tell() (returning 8192 on my system).
>
> However, I find that under Python 3.3 this same code works: it prints some
> lines from my file and after completing in_file_object.tell() returns a
> quite reasonable 314 as the current position in the file.
>
> I couldn't find this difference anywhere in the documentation. Is the 3.3
> behavior official, and if so, when was it introduced and how is it
> implemented? I assume the read-ahead buffer still exists?

You can get the Python 3 behaviour with io.open() in Python 2.7. The
read-ahead buffer still exists, but tell() corrects for it. There is a
pure-Python implementation in _pyio.py, where the buffered reader subtracts
the unread part of its buffer from the position of the underlying raw file:

    def tell(self):
        return _BufferedIOMixin.tell(self) - len(self._read_buf) + self._read_pos

> By the way, the 3.3 behavior only works in binary mode. In text mode, the
> code will raise an OSError: telling position disabled by next() call. In
> Python 2.7 there was no difference between the binary and text mode
> behavior. Could not find this documented either.
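
For anyone who wants to check both observations from this thread, here is a minimal sketch, assuming CPython 3. The helper names (make_sample_file, positions_during_iteration, text_mode_tell_fails) and the 100-line sample file are made up for illustration and are not part of any library. It records tell() after each line in binary mode, then tests whether tell() raises OSError while iterating in text mode.

```python
import io
import os
import tempfile


def make_sample_file(line_count=100):
    """Write line_count lines of exactly 10 bytes each; return the file path."""
    handle = tempfile.NamedTemporaryFile(delete=False, suffix=".txt")
    with handle:
        for i in range(line_count):
            # "line 0000\n" is 10 bytes, so offsets are easy to predict.
            handle.write(("line %04d\n" % i).encode("ascii"))
    return handle.name


def positions_during_iteration(path, limit=300):
    """Iterate in binary mode and record tell() after every line.

    Stops once the reported position exceeds limit.
    """
    positions = []
    with io.open(path, "rb") as f:
        for line in f:
            positions.append(f.tell())
            if positions[-1] > limit:
                break
    return positions


def text_mode_tell_fails(path):
    """Return True if tell() raises OSError during text-mode iteration."""
    with io.open(path, "r", encoding="ascii") as f:
        for line in f:
            try:
                f.tell()
            except OSError:
                return True
            return False
    return False  # empty file: nothing was iterated


if __name__ == "__main__":
    sample = make_sample_file()
    try:
        print(positions_during_iteration(sample))
        print(text_mode_tell_fails(sample))
    finally:
        os.remove(sample)
```

In binary mode the reported positions should advance one line at a time instead of jumping to the end of the read-ahead buffer. The text-mode check only catches OSError, so under Python 2.7 (where io raises IOError) it would need adjusting.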