John Machin <sjmac...@lexicon.net> wrote:
> On Jun 8, 12:13 am, "R. David Murray" <rdmur...@bitdance.com> wrote:
> > higer <higerinbeij...@gmail.com> wrote:
> > > My file contains such strings :
> > > \xe6\x97\xa5\xe6\x9c\x9f\xef\xbc\x9a
> >
> > If those bytes are what is in the file (and it sounds like they are),
> > then the data in your file is not in UTF8 encoding, it is in ASCII
> > encoded as hexadecimal escape codes.
>
> OK, I'll bite: what *ASCII* character is encoded as either "\xe6" or
> r"\xe6" by what mechanism in which parallel universe?

Well, you are correct that the OP might have had trouble parsing my
English. My English is more or less valid ("[the file] is _in_ ASCII",
i.e. it consists of ASCII characters, "encoded as hexadecimal escape
codes", which specifies the encoding used). But it would perhaps have
been better to just say that the data is encoded as hexadecimal escape
sequences.

--David
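
To make the distinction concrete, here is a minimal sketch, assuming Python 3 and assuming the file really holds the literal ASCII characters backslash, x, and two hex digits for each byte. The helper name decode_hex_escapes is hypothetical, and going through the unicode_escape and latin-1 codecs is just one way to do it, not necessarily what anyone in this thread used.

```python
# The data as described by the OP: plain ASCII text, four characters
# (backslash, "x", two hex digits) for every encoded byte.
# The rb prefix keeps the backslashes literal.
escaped = rb"\xe6\x97\xa5\xe6\x9c\x9f\xef\xbc\x9a"


def decode_hex_escapes(data: bytes) -> str:
    """Hypothetical helper: turn ASCII hex escape sequences into text."""
    # Step 1: interpret the escapes; each one becomes the code point U+00NN.
    as_code_points = data.decode("unicode_escape")
    # Step 2: map code points 0-255 back to single bytes.
    raw_bytes = as_code_points.encode("latin-1")
    # Step 3: these bytes are the actual UTF-8 data, so decode them as such.
    return raw_bytes.decode("utf-8")


print(escaped.isascii())            # True: the file content is pure ASCII
print(len(escaped))                 # 36 characters for 9 encoded bytes
print(decode_hex_escapes(escaped))  # the three characters the escapes stand for
```

Decoding the escaped form directly as UTF-8 would succeed, since ASCII is a subset of UTF-8, but it would simply return the backslash sequences unchanged.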