[issue9692] UnicodeDecodeError in ElementTree.tostring()
New submission from Ulrich Seidl:

The following code raises a UnicodeDecodeError in Python 2.7, while it works fine in 2.6 and 2.5:

    # -*- coding: latin-1 -*-
    import xml.etree.cElementTree as ElementTree
    oDoc = ElementTree.fromstring( '' )
    oDoc.set( "ATTR", "ÄÖÜ" )
    print ElementTree.tostring( oDoc , encoding="iso-8859-1" )

--
components: XML
messages: 114980
nosy: uis
priority: normal
severity: normal
status: open
title: UnicodeDecodeError in ElementTree.tostring()
versions: Python 2.7

___
Python tracker <http://bugs.python.org/issue9692>
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue9692] UnicodeDecodeError in ElementTree.tostring()
Ulrich Seidl added the comment:

Of course, it works if you use a unicode string, and it would be easy to switch to unicode for this demo code. Unfortunately, the affected application is more complex, and switching it to unicode is not that easy.

I just wonder why the tostring() method does not assume that internal strings are encoded in the explicitly provided encoding. Is ElementTree restricted to the use of unicode strings? And why did it work (as expected) with Python 2.5 and Python 2.6?
[issue9692] UnicodeDecodeError in ElementTree.tostring()
Ulrich Seidl added the comment:

Well, the output of the print is not that interesting, as long as ElementTree is able to restore the former attribute's value when reading it in again. The print was just used to illustrate that a UnicodeDecodeError appears. Think about doing an ElementTree.fromstring( ... ).get( "ATTR" ).encode( "iso-8859-1" ).
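
The round trip described above can be sketched with Unicode strings. This is a minimal illustration written for Python 3, where every str is Unicode, so it shows the Unicode-based path and not the byte-string case from the report. The element name "root" and the variable names are assumptions made for the example.

```python
import xml.etree.ElementTree as ElementTree

# Hypothetical element; the input document from the report is not shown.
doc = ElementTree.Element("root")
doc.set("ATTR", "\u00c4\u00d6\u00dc")  # the three umlauts as a Unicode string

# Serialize to ISO-8859-1 bytes, then parse those bytes back in.
serialized = ElementTree.tostring(doc, encoding="iso-8859-1")
restored = ElementTree.fromstring(serialized).get("ATTR")

# The attribute value survives and encodes to the original Latin-1 bytes.
latin1_bytes = restored.encode("iso-8859-1")
```

Under these assumptions the attribute comes back unchanged, which is the property the comment cares about; what the serialized text looks like when printed is secondary.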
[issue9692] UnicodeDecodeError in ElementTree.tostring()
Ulrich Seidl added the comment:

I would suggest adding an additional except branch to (at least) the following functions of ElementTree.py:

* _encode,
* _escape_attrib, and
* _escape_cdata

The except branch could look like:

    except (UnicodeDecodeError):
        return text.decode( encoding ).encode( encoding, "xmlcharrefreplace")
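
The idea behind the suggested branch can be sketched as a standalone function. This is not the ElementTree.py code: encode_with_fallback is a hypothetical helper, and it is written for Python 3, where the distinction between byte strings and text is explicit and is tested directly, with no UnicodeDecodeError to catch.

```python
def encode_with_fallback(text, encoding):
    """Hypothetical helper: return `text` as bytes in `encoding`."""
    if isinstance(text, bytes):
        # Assumption taken from the suggestion: byte strings are already
        # encoded in the target encoding, so decode them with it first.
        text = text.decode(encoding)
    # Characters the encoding cannot represent become numeric
    # character references such as &#8364;.
    return text.encode(encoding, "xmlcharrefreplace")


# Latin-1 bytes pass through unchanged; the euro sign is not in
# ISO-8859-1, so it is replaced by a character reference.
same = encode_with_fallback(b"\xc4\xd6\xdc", "iso-8859-1")
euro = encode_with_fallback("\u20ac", "iso-8859-1")
```

The decode step only succeeds if the byte string really is in the stated encoding, which is the assumption the suggestion relies on.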
[issue12931] xmlrpclib confuses unicode and string
Ulrich Seidl added the comment:

The change set committed for 2.7 introduces another problem. At the beginning of xmlrpclib.py, there is an explicit test for the availability of unicode:

    try:
        unicode
    except NameError:
        unicode = None # unicode support not available

In case unicode was set to None, a

    TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types

will be raised by the code introduced to ServerProxy:

    if isinstance(uri, unicode):
        uri = uri.encode('ISO-8859-1')

--
nosy: +uis

___
Python tracker <http://bugs.python.org/issue12931>
___
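
One way to avoid the TypeError described above is to check for None before calling isinstance(). The sketch below only illustrates that guard and is not necessarily the fix that was committed; normalize_uri is a hypothetical helper name and the URL is a placeholder.

```python
try:
    unicode  # defined on Python 2 builds with Unicode support
except NameError:
    unicode = None  # unicode support not available (also true on Python 3)


def normalize_uri(uri):
    """Hypothetical helper: encode a unicode URI, skipping the type check
    when the unicode type is unavailable."""
    if unicode is not None and isinstance(uri, unicode):
        uri = uri.encode('ISO-8859-1')
    return uri


# With unicode set to None, the guard short-circuits and isinstance()
# is never called with None as its second argument.
result = normalize_uri("http://localhost/RPC2")
```

The "is not None" test runs first, so the isinstance() call is only reached when unicode is an actual type.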