I thought I was done with this crap once I moved to 3.x but some Winblows machines are still sending what some circles call "Extended ASCII". I have a file that I am trying to read and it is barfing on some characters. For example:

  due to the Qu\xe9bec government

Obviously that should be "due to the Québec government". I can't figure out what that encoding is or whether it can even be understood outside of M$. I have tried ascii, cp437, cp858, cp1140, cp1250, latin-1, utf8 and others. None of them recognize that character. Can someone please tell me which encoding includes that character?
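
One quick way to check which codecs map the byte 0xE9 to "é" is to decode the sample with each candidate. This is only a sketch: the helper name decode_candidates is made up for illustration, and the codec list is a subset of the ones tried above plus cp1252. The results are printed through ascii() so the check itself cannot fail on an ASCII-only terminal.

```python
def decode_candidates(raw, codecs):
    """Return {codec: decoded text, or None if the bytes are invalid in that codec}."""
    results = {}
    for name in codecs:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


sample = b"Qu\xe9bec"
results = decode_candidates(sample, ["ascii", "cp437", "latin-1", "cp1252", "utf-8"])

# ascii() escapes non-ASCII characters, so printing is safe on any stdout.
for name, text in results.items():
    print(name, ascii(text))
```

A codec that reports '\xe9' in the escaped output decoded the byte to U+00E9 (é). None means the byte sequence is not valid in that codec.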

Here is the failing code:

import sys

# read the file named on the command line and echo each line
with open(sys.argv[1], encoding="latin-1") as fp:
  for ln in fp:
    print(ln)

Traceback (most recent call last):
  File "./load_iff", line 11, in <module>
    print(ln)
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 132: ordinal not in range(128)

I don't understand why the error says "ascii" when I told it to use "latin-1".
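
One likely explanation, sketched here: the encoding= argument to open() only controls how the file's bytes are decoded into str. The traceback is a UnicodeEncodeError raised inside print(), which has to encode the text again for sys.stdout, and stdout's encoding comes from the environment (it tends to be ASCII under a C/POSIX locale), not from open(). The sketch assumes that is what is happening here; the helper name safe_for is made up for illustration.

```python
import sys

raw = b"due to the Qu\xe9bec government"

# Decoding side: this is all that open(..., encoding="latin-1") controls.
text = raw.decode("latin-1")

# Encoding side: what print() has to do when stdout is ASCII-only.
try:
    text.encode("ascii")
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc


def safe_for(stream_encoding, s):
    """Hypothetical helper: escape whatever the target encoding cannot represent."""
    return s.encode(stream_encoding, errors="backslashreplace").decode(stream_encoding)


# Show which encoding print() is actually using, then a version that is safe to print.
print(ascii(sys.stdout.encoding))
print(safe_for("ascii", text))
```

If stdout does turn out to be ASCII, options include running under a UTF-8 locale, setting PYTHONIOENCODING=utf-8 in the environment, or calling sys.stdout.reconfigure(encoding="utf-8") on Python 3.7 and later.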

--
D'Arcy J.M. Cain
Vybe Networks Inc.
http://www.VybeNetworks.com/
IM:da...@vex.net VoIP: sip:da...@vybenetworks.com
--
https://mail.python.org/mailman/listinfo/python-list
