STINNER Victor added the comment:

> The surrogateescape error handler is dangerous with utf-16/32. It can produce
> globally invalid output.

I don't understand, can you give an example? surrogateescape generates invalid encoded strings with any encoding. Example with UTF-8:

>>> b"a\xffb".decode("utf-8", "surrogateescape")
'a\udcffb'
>>> 'a\udcffb'.encode("utf-8", "surrogateescape")
b'a\xffb'
>>> b'a\xffb'.decode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 1: invalid start byte

So str.encode("utf-8", "surrogateescape") produces an invalid UTF-8 sequence.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18713>
_______________________________________
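
The UTF-8 round trip from the message can also be written as a small standalone script. This is only a sketch of the behaviour described there: the helper names `roundtrip` and `is_valid` are hypothetical, introduced for illustration, and it covers UTF-8 only, not the utf-16/32 case under discussion.

```python
def roundtrip(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Decode with surrogateescape, then encode back with the same handler.

    Hypothetical helper, not part of the standard library.
    """
    text = raw.decode(encoding, "surrogateescape")
    return text.encode(encoding, "surrogateescape")


def is_valid(raw: bytes, encoding: str = "utf-8") -> bool:
    """Return True if raw decodes under the default strict error handler.

    Hypothetical helper, not part of the standard library.
    """
    try:
        raw.decode(encoding)  # errors="strict" is the default
    except UnicodeDecodeError:
        return False
    return True


raw = b"a\xffb"  # 0xff can never appear in well-formed UTF-8

# The undecodable byte 0xff is mapped to the lone surrogate U+DCFF.
text = raw.decode("utf-8", "surrogateescape")
print(ascii(text))               # 'a\udcffb'

# Encoding with the same handler restores the original bytes exactly...
print(roundtrip(raw) == raw)     # True

# ...which means the encoder's output is not valid UTF-8.
print(is_valid(roundtrip(raw)))  # False
```

The round trip is lossless by design, so the encoder has to reproduce the invalid byte it was given; that is the sense in which the output is an invalid UTF-8 sequence.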