On Wed, Sep 25, 2013 at 4:43 AM, <wxjmfa...@gmail.com> wrote:
> - The *mark* (once the Unicode.org terminology in FAQ) indicating
> a unicode encoded raw text file is neither a byte order mark,
> nor a signature, it is an encoded code point, the encoded
> U+FEFF, 'ZERO WIDTH NO-BREAK SPACE', code point. (Note, a
> non breaking space at the start of a text is a non sense.)
>
> - When such a mark exists, it is always possible to work
> 100% safely. No possible error.

I have a file encoded in Latin-1 which begins with LATIN SMALL LETTER
Y WITH DIAERESIS followed by LATIN SMALL LETTER THORN (that is, the
bytes 0xFF 0xFE, which are also the UTF-16 little-endian form of
U+FEFF). I also have a file encoded in EBCDIC (okay, I don't really,
but let's pretend) that begins with the same bytes.

But of course, when such a mark exists, there is no possible error -
of that there is no manner of doubt, no possible, probable shadow of
doubt, no possible doubt whatever.

("No possible doubt whatever.")

ChrisA
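
To make the Latin-1 example above concrete, here is a minimal Python sketch. It assumes the file's first two bytes are 0xFF 0xFE, which is what those two characters encode to in Latin-1. The name `head` and the choice of cp037 as "an EBCDIC code page" are illustrative assumptions, not anything from the original file.

```python
import codecs

# First two bytes of the hypothetical Latin-1 file:
# U+00FF (y with diaeresis) followed by U+00FE (thorn).
head = "\u00ff\u00fe".encode("latin-1")

# The same two bytes are the UTF-16 little-endian encoding of U+FEFF,
# so a BOM sniffer would take this file for UTF-16.
looks_like_bom = head.startswith(codecs.BOM_UTF16_LE)

# Yet they decode without error as ordinary text in other encodings.
as_latin1 = head.decode("latin-1")
as_ebcdic = head.decode("cp037")  # one EBCDIC code page, picked for illustration

print(head, looks_like_bom, ascii(as_latin1), ascii(as_ebcdic))
```

The bytes alone cannot say which reading is the intended one, so the presence of the mark does not rule out a wrong guess.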