Dmitry Vasiliev <[EMAIL PROTECTED]> added the comment:

Actually, RFC 977 said all characters must be ASCII, but RFC 3977 changed the default character set to UTF-8. So I think UTF-8 should be the default encoding, not Latin-1. Moreover, Latin-1 can silently hide the real encoding, because it decodes any byte sequence without raising an error. For example:
>>> u'\u0422\u0435\u0441\u0442'.encode("koi8-r").decode("latin1")
u'\xf4\xc5\xd3\xd4'

Additionally, in the future it would be a good idea to look in the article headers for the encoding of the article body.

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3714>
_______________________________________
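
The masking effect described in the comment can be contrasted with UTF-8's behaviour in a short sketch. It assumes Python 3, and the helper name `try_decode` and the variable names are hypothetical illustrations, not part of nntplib. It shows only that Latin-1 accepts KOI8-R bytes without complaint while UTF-8 rejects them, which is what would let a caller notice the wrong encoding.

```python
def try_decode(raw, encoding):
    """Hypothetical helper: return the decoded text, or None if the bytes
    are not valid in the given encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


text = "\u0422\u0435\u0441\u0442"      # the Cyrillic word from the comment
koi8_bytes = text.encode("koi8-r")     # bytes as a KOI8-R server might send them

# Latin-1 maps every byte value to a code point, so decoding always
# "succeeds" and the result is mojibake rather than an error.
masked = try_decode(koi8_bytes, "latin-1")

# The same bytes are not well-formed UTF-8, so the mismatch is detected.
rejected = try_decode(koi8_bytes, "utf-8")

# Decoding with the real encoding recovers the original text.
recovered = try_decode(koi8_bytes, "koi8-r")

print(repr(masked), rejected, recovered == text)
```

Not every mis-encoded byte sequence is invalid UTF-8, so a strict UTF-8 default makes such errors more likely to surface but does not guarantee it.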