STINNER Victor <victor.stin...@haypocalc.com> added the comment:

Example on Windows Vista with ANSI=cp932:
>>> import codecs
>>> codecs.code_page_encode(1252, '\xe9')
(b'\xe9', 1)
>>> codecs.mbcs_encode('\xe9')
...
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character
>>> codecs.code_page_encode(932, '\xe9')
...
UnicodeEncodeError: 'cp932' codec can't encode characters in position 0--1: invalid character
>>> codecs.code_page_encode(932, '\xe9', 'replace')
(b'e', 1)
>>> codecs.code_page_encode(932, '\xe9', 'ignore')
(b'', 8)
>>> codecs.code_page_encode(932, '\xe9', 'backslashreplace')
(b'\\xe9', 8)

You can use a code page different than the ANSI code page. The encoding name is generated from the code page number: "cp%u" % code_page, or "mbcs" if code_page == CP_ACP.

(Oops, I forgot a printf() in mbcs2.patch)

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12281>
_______________________________________
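For readers who want the naming rule from the comment above in runnable form, here is a minimal Python sketch. It only mirrors the rule as stated ("cp%u" % code_page, or "mbcs" for CP_ACP); the patch itself implements this in C, and the helper name code_page_encoding_name is hypothetical, not part of the codecs module. CP_ACP is assumed to be 0, its value in the Windows headers.

```python
# Illustrative sketch only: the patch does this in C, not with this helper.
CP_ACP = 0  # Windows constant selecting the system ANSI code page


def code_page_encoding_name(code_page: int) -> str:
    """Hypothetical helper: name reported in error messages for a code page."""
    if code_page == CP_ACP:
        # The ANSI code page keeps the historical "mbcs" name.
        return "mbcs"
    # Python's %d plays the role of C's %u for non-negative code page numbers.
    return "cp%d" % code_page
```

Under these assumptions, code_page_encoding_name(932) gives "cp932" and code_page_encoding_name(CP_ACP) gives "mbcs", matching the codec names shown in the two UnicodeEncodeError messages.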