On Dec 6, 10:35 am, Steven D'Aprano <[EMAIL PROTECTED] cybersource.com.au> wrote:
> On Fri, 05 Dec 2008 12:00:59 -0700, Joe Strout wrote:
> >> So UTF-16 has an explicit EOF marker within the text?
> >
> > No, it does not. I don't know what Terry's thinking of there, but text
> > files do not have any EOF marker. They start at the beginning
> > (sometimes including a byte-order mark), and go till the end of the
> > file, period.
>
> Windows text files still interpret ctrl-Z as EOF, or at least Windows XP
> does. Vista, who knows?

This applies only to files being read in an 8-bit text mode. It is inherited from MS-DOS, which followed the CP/M convention, which was necessary because CP/M's file system recorded only the physical file length in 128-byte sectors, not the logical length. It is likely to continue in perpetuity, just as standard railway gauge is (allegedly) based on the axle-length of Roman chariots.

None of this is relevant to the OP's problem; his file appears to have been truncated rather than having spurious data appended to it.
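
For anyone curious how the CP/M convention plays out, here is a rough sketch in plain Python. It does not depend on any platform's text mode; it only simulates the idea. The function names are made up for illustration, and it assumes the writer fills the rest of the last 128-byte sector with Ctrl-Z bytes (0x1A), which is one common convention rather than something every program did.

```python
SECTOR_SIZE = 128   # CP/M recorded file length only in whole 128-byte sectors
CTRL_Z = b"\x1a"    # Ctrl-Z (ASCII SUB), used as the logical end-of-text marker


def pad_to_sectors(data: bytes) -> bytes:
    """Pad data with Ctrl-Z up to the next sector boundary (hypothetical writer)."""
    remainder = len(data) % SECTOR_SIZE
    if remainder == 0:
        # The text exactly fills its sectors, so no marker is added.
        return data
    return data + CTRL_Z * (SECTOR_SIZE - remainder)


def logical_text(stored: bytes) -> bytes:
    """Return the bytes before the first Ctrl-Z, as an 8-bit text-mode reader would."""
    end = stored.find(CTRL_Z)
    return stored if end == -1 else stored[:end]


text = b"hello, world\r\n"
stored = pad_to_sectors(text)

print(len(text))                      # logical length: 14
print(len(stored))                    # physical length: 128
print(logical_text(stored) == text)   # True
```

The reader has no other way to recover the 14-byte logical length from the 128-byte physical one, which is why the marker was needed. It also shows why the convention is harmful for binary data: any 0x1A byte in the middle of the data would cut the read short.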