New submission from Neil Muller <drnlmuller+b...@gmail.com>:

In py3k, ElementTree no longer correctly converts characters to entities when they can't be represented in the requested output encoding.

Python 2:

>>> import xml.etree.ElementTree as ET
>>> e = ET.XML("<?xml version='1.0' encoding='iso-8859-1'?><body>t\xe3t</body>")
>>> ET.tostring(e, 'ascii')
"<?xml version='1.0' encoding='ascii'?>\n<body>t&#227;t</body>"

Python 3:

>>> import xml.etree.ElementTree as ET
>>> e = ET.XML("<?xml version='1.0' encoding='iso-8859-1'?><body>t\xe3t</body>")
>>> ET.tostring(e, 'ascii')
.....
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)

It looks like _encode_entity isn't ever called inside ElementTree anymore - it probably should be called as part of _encode for characters that can't be represented.

----------
components: Library (Lib)
messages: 89058
nosy: Neil Muller, effbot, hodgestar
severity: normal
status: open
title: ElementTree (py3k) doesn't properly encode characters that can't be represented in the specified encoding
type: behavior
versions: Python 3.0, Python 3.1

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6233>
_______________________________________
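
For illustration only, the fallback described in the report (emit a numeric character reference when a character can't be represented in the target encoding) can be sketched with the standard "xmlcharrefreplace" codec error handler. The helper name escape_text below is hypothetical and is not part of ElementTree; whether the library should route this through _encode_entity or through the codec error handler is a separate question that this sketch does not settle.

```python
def escape_text(text, encoding="ascii"):
    """Hypothetical helper (not an ElementTree function).

    Escapes XML markup characters, then encodes the result. Characters
    the target encoding cannot represent become numeric character
    references such as &#227; instead of raising UnicodeEncodeError.
    """
    # Escape '&' first so the entities added below are not re-escaped.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    # "xmlcharrefreplace" is a standard codec error handler for encoding.
    return text.encode(encoding, errors="xmlcharrefreplace")


if __name__ == "__main__":
    print(escape_text("t\xe3t", "ascii"))
    print(escape_text("t\xe3t", "iso-8859-1"))
```

With 'ascii' the non-ASCII character is written as a character reference; with 'iso-8859-1' it is representable, so it is encoded directly and no reference is needed.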