Re: cp936 uses gbk codec, doesn't decode `\x80` as U+20AC EURO SIGN

2010-10-11 Thread Ulrich Eckhardt
John Machin wrote:
> |>>> '\x80'.decode('cp936')
> Traceback (most recent call last):
>   File "", line 1, in
> UnicodeDecodeError: 'gbk' codec can't decode byte 0x80
> in position 0: incomplete multibyte sequence
[...]
> So Microsoft appears to think that
> cp936 includes the euro,
> and the ICU

cp936 uses gbk codec, doesn't decode `\x80` as U+20AC EURO SIGN

2010-10-10 Thread John Machin
|>>> '\x80'.decode('cp936')
Traceback (most recent call last):
  File "", line 1, in
UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 0: incomplete multibyte sequence

However: Retrieved 2010-10-10 from http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP936.TXT
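
For anyone who needs the Microsoft-style behaviour in the meantime, here is a rough workaround sketch, not a fix to the codec itself. It is written for Python 3 (bytes literals, `str` results) rather than the Python 2 session above. The handler name `cp936_euro` and the helper `decode_cp936` are made up for illustration, and the sketch assumes the only mapping to patch is the lone byte 0x80.

```python
import codecs


def _euro_for_0x80(exc):
    """Error handler: map an undecodable byte 0x80 to U+20AC EURO SIGN."""
    if (isinstance(exc, UnicodeDecodeError)
            and exc.object[exc.start:exc.start + 1] == b'\x80'):
        # Emit the euro sign and resume decoding just after the 0x80 byte.
        return ('\u20ac', exc.start + 1)
    # Any other failure is left alone and re-raised.
    raise exc


# 'cp936_euro' is a hypothetical handler name, not part of the stdlib.
codecs.register_error('cp936_euro', _euro_for_0x80)


def decode_cp936(data):
    """Decode bytes as gbk, treating an otherwise undecodable 0x80 as the euro."""
    return data.decode('gbk', errors='cp936_euro')


print(decode_cp936(b'\x80'))
```

The handler only runs when the gbk codec reports an error, so 0x80 appearing as the valid second byte of a two-byte GBK character should still decode normally.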