On 2006-10-16, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Hello,
>
> here is something that surprises me.
>
> #coding: iso-8859-1
I think that's supposed to be:

# -*- coding: iso-8859-1 -*-

The special comment changes only the encoding of unicode
literals. In particular, it doesn't change the default encoding
of str literals.

> s1=u"Frau Müller machte große Augen"
> s2="Frau Müller machte große Augen"
> if s1 == s2:
>     pass

On my machine, the ü and ß in s2 are stored as the byte values of
my terminal's encoding, cp437. Unfortunately, the cp437 code
points from 128-255 are not the same as those in iso-8859-1. To
fix this, I have to do the following:

>>> s1 == s2.decode('cp437')
True

> Running this code produces a UnicodeDecodeError:
>
> Traceback (most recent call last):
>   File "tmp.py", line 4, in ?
>     if s1 == s2:
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 6:
>     ordinal not in range(128)
>
> I would have expected that "s1 == s2" gives True... or maybe
> False... but raising an error here is unnecessary. I guess that
> the comparison operator decides to convert s2 to a Unicode but
> forgets that I said #coding: iso-8859-1 at the beginning of the
> file.

It's trying to interpret s2 as ascii, and failing, since bytes
above 127 are out of range for that codec: 0xfc in your
traceback, 129 and 225 in my cp437 copy.

-- 
Neil Cerutti
-- 
http://mail.python.org/mailman/listinfo/python-list
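For anyone who wants to check the byte values mentioned above, here is a minimal sketch. It assumes Python 3, where bytes roughly plays the role of the Python 2 str type and str the role of unicode, so it is not the code from the original tmp.py. The variable names are made up for illustration.

```python
# Python 3 sketch: bytes stands in for the Python 2 str type,
# and str stands in for the Python 2 unicode type.
text = "Frau Müller machte große Augen"

# Encode the same text with the two codecs discussed in the thread.
latin1_bytes = text.encode("iso-8859-1")
cp437_bytes = text.encode("cp437")

# The ü sits at index 6 and the ß at index 22; each codec stores
# them as a different byte value.
print(hex(latin1_bytes[6]), hex(latin1_bytes[22]))
print(hex(cp437_bytes[6]), hex(cp437_bytes[22]))

# Decoding with the codec that matches the bytes recovers the text.
assert latin1_bytes.decode("iso-8859-1") == text
assert cp437_bytes.decode("cp437") == text

# Decoding with ascii fails at the first byte above 127.
error = None
try:
    latin1_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
```

Decoding explicitly with whichever codec actually produced the bytes is the same idea as the s2.decode('cp437') call shown earlier.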