Thanks, Peter, for this very helpful reply and for pointing out _pyio.py to me! It's great to be able to check implementation details sometimes.

So, if I understand you correctly, I can simply import io and open files with io.open() instead of open() (although this is a bit of a detour in Python 3), and this will ensure version-independent behavior of my code? That's cool!

What will my IO object return when I read from it in Python 2.7? str where Python 3 gives bytes, and unicode where Python 3 gives str? That is what I understood from the Python 2.7 io module documentation.
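To make the question concrete, here is a small sketch of how I would check the returned types myself. The helper name read_types and the temporary sample file are just my own illustration, not anything from the io documentation:

```python
import io
import os
import tempfile


def read_types(path):
    """Return the type names of data read via io.open() in binary and text mode."""
    with io.open(path, "rb") as f:
        binary_type = type(f.read()).__name__
    with io.open(path, "r", encoding="utf-8") as f:
        text_type = type(f.read()).__name__
    return binary_type, text_type


# Create a small throwaway sample file (hypothetical content).
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"first line\nsecond line\n")

types_seen = read_types(sample_path)
print(types_seen)

os.remove(sample_path)
```

Under Python 3 this reports bytes and str. If my reading of the 2.7 io documentation is right, the same script should report str and unicode there.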
-----Original Message-----
From: Peter Otten [mailto:__pete...@web.de]
Sent: Thursday, January 17, 2013 1:04 PM
To: python-list@python.org
Subject: Re: iterating over the lines of a file - difference between Python 2.7 and 3?

You can get the Python 3 behaviour with io.open() in Python 2.7. There is an implementation in Python in _pyio.py:

    def tell(self):
        return _BufferedIOMixin.tell(self) - len(self._read_buf) + self._read_pos

Wolfgang Maier wrote:

> I just came across an unexpected behavior in Python 3.3, which has to
> do with file iterators and their interplay with other methods of
> file/IO class methods, like readline() and tell(): Basically, I got
> used to the fact that it is a bad idea to mix them because the
> iterator would use that hidden read-ahead buffer, so what you got with
> subsequent calls to readline() or tell() was what was beyond that
> buffer, but not the next thing after what the iterator just returned.
>
> Example:
>
>     in_file_object = open('some_file', 'rb')
>     for line in in_file_object:
>         print(line)
>         if in_file_object.tell() > 300:
>             # assuming that individual lines are shorter
>             break
>
> This wouldn't print anything in Python 2.7 since next(in_file_object)
> would read ahead beyond the 300 position immediately, as evidenced by
> a subsequent call to in_file_object.tell() (returning 8192 on my system).
>
> However, I find that under Python 3.3 this same code works: it prints
> some lines from my file and after completing in_file_object.tell()
> returns a quite reasonable 314 as the current position in the file.
>
> I couldn't find this difference anywhere in the documentation. Is the
> 3.3 behavior official, and if so, when was it introduced and how is it
> implemented? I assume the read-ahead buffer still exists?
>
> By the way, the 3.3 behavior only works in binary mode. In text mode, the
> code will raise an OSError: telling position disabled by next() call.
> In Python 2.7 there was no difference between the binary and text mode
> behavior. Could not find this documented either.
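
For reference, the binary/text difference described in the quoted message can be reproduced with a short script. This is only a sketch under my own assumptions: the sample file content and the variable names positions and text_mode_error are made up for illustration, and I use io.open() throughout so that the same io classes are involved regardless of the Python version.

```python
import io
import os
import tempfile

# Write a tiny sample file: three lines of 6, 5 and 6 bytes.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"alpha\nbeta\ngamma\n")

# Binary mode: tell() reports the position right after the line that the
# iterator just returned, even though a read-ahead buffer is in use.
positions = []
with io.open(path, "rb") as f:
    for line in f:
        positions.append(f.tell())

# Text mode: calling tell() while iterating raises an error instead.
text_mode_error = None
with io.open(path, "r", encoding="ascii") as f:
    for line in f:
        try:
            f.tell()
        except (IOError, OSError) as exc:
            text_mode_error = exc
        break

os.remove(path)

print(positions)
print(repr(text_mode_error))
```

With this file, the binary loop collects the offsets 6, 11 and 17, which are the end positions of the three lines, while the text-mode loop records the "telling position disabled by next() call" error.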