New submission from Serhiy Storchaka:

The error handler in unicode_escape_decode() consumes one or more bytes that follow an illegal escape sequence, so those bytes are missing from the decoded output.
>>> import codecs
>>> codecs.unicode_escape_decode(br'\u!@#', 'replace')
('�', 5)
>>> codecs.unicode_escape_decode(br'\u!@#$', 'replace')
('�@#$', 6)

raw_unicode_escape_decode() works right:

>>> codecs.raw_unicode_escape_decode(br'\u!@#', 'replace')
('�!@#', 5)
>>> codecs.raw_unicode_escape_decode(br'\u!@#$', 'replace')
('�!@#$', 6)

See also issue16975.

----------
assignee: serhiy.storchaka
components: Unicode
messages: 180077
nosy: ezio.melotti, serhiy.storchaka
priority: normal
severity: normal
stage: needs patch
status: open
title: Broken error handling in codecs.unicode_escape_decode()
type: behavior
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16979>
_______________________________________
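
The comparison in this report can be scripted as a quick check. The sketch below is not part of any patch; the helper names compare_escape_decoders() and handlers_agree() are made up for illustration. It assumes the input contains only \u or \U escapes, since those are the only cases where the two decoders are expected to produce the same text. On an affected interpreter the two results should differ for the samples from this report; on an interpreter where the error handling is fixed they are expected to match.

```python
import codecs


def compare_escape_decoders(data, errors="replace"):
    """Decode `data` with both codecs and return (escaped, raw).

    Each element is the (text, bytes_consumed) tuple returned by the
    corresponding codecs function. Hypothetical helper for illustration.
    """
    escaped = codecs.unicode_escape_decode(data, errors)
    raw = codecs.raw_unicode_escape_decode(data, errors)
    return escaped, raw


def handlers_agree(data, errors="replace"):
    """Return True if both decoders give the same text and byte count."""
    escaped, raw = compare_escape_decoders(data, errors)
    return escaped == raw


if __name__ == "__main__":
    # The two byte strings from the report: a \u escape with no hex digits.
    for sample in (br'\u!@#', br'\u!@#$'):
        escaped, raw = compare_escape_decoders(sample)
        verdict = "agree" if escaped == raw else "differ"
        print(sample, "unicode_escape:", escaped, "raw:", raw, verdict)
```

Comparing against raw_unicode_escape_decode() only makes sense for inputs like these; for other backslash escapes (for example \n) the two codecs differ by design.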