New submission from Marc-Andre Lemburg: In Python 2, the unicode() constructor does not accept bytes arguments, unless an encoding argument is given:
>>> unicode(u'abcäöü'.encode('utf-8')) Traceback (most recent call last): File "<stdin>", line 1, in <module> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128) In Python 3, the str() constructor masks this programming error by returning the repr() of the bytes object: >>> str('abcäöü'.encode('utf-8')) "b'abc\\xc3\\xa4\\xc3\\xb6\\xc3\\xbc'" I think it would be more helpful to point the programmer to the most probably missing encoding argument by raising an error. Also note that you get a different output with encoding argument set: >>> str('abcäöü'.encode('utf-8'), 'utf-8') 'abcäöü' I know this is documented, but it is still not very helpful and can easily hide errors. ---------- components: Interpreter Core, Unicode messages: 241800 nosy: ezio.melotti, haypo, lemburg priority: normal severity: normal status: open title: str(bytes_obj) should raise an error versions: Python 3.5, Python 3.6 _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue24025> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com