New submission from Serhiy Storchaka <storch...@gmail.com>:

codecs.charmap_decode behaves differently depending on whether the decoding table is an exact str instance or an instance of a user-defined str subclass.

>>> import codecs
>>> print(ascii(codecs.charmap_decode(b'\x00', 'replace', '\uFFFE')))
('\ufffd', 1)
>>> class S(str): pass
...
>>> print(ascii(codecs.charmap_decode(b'\x00', 'replace', S('\uFFFE'))))
('\ufffe', 1)

This happens because the charmap decoder (the function PyUnicode_DecodeCharmap in Objects/unicodeobject.c) uses one algorithm for exact str instances and a different one for everything else.

Do we need to fix this? If so, what should `codecs.charmap_decode(b'\x00', 'replace', {0:'\uFFFE'})` return? And what should `codecs.charmap_decode(b'\x00', 'replace', {0:0xFFFE})` return?

----------
components: Interpreter Core
messages: 161054
nosy: storchaka
priority: normal
severity: normal
status: open
title: The inconsistency of codecs.charmap_decode
type: behavior
versions: Python 2.7, Python 3.2, Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14850>
_______________________________________