R. David Murray added the comment: To understand why, understand that a byte string has no encoding inherent. So when you call b'utf8string'.decode('unicode_escape'), python has no way to know how to interpret the non-ascii characters in that bytestring. If you want the unicode_escape representation of something, you want to do 'string'.encode('unicode_escape'). If you then want that as a python string, you can do:
'mystring'.encode('unicode_escape').decode('ascii') In theory there ought to be a way to use the codecs module to go directly from unicode string to unicode-escaped string, but I don't know how to do it, since the proposal for the 'transform' method was rejected :) Just to bend your brain a bit further, note that this does work: >>> codecs.decode(codecs.encode('ä', 'unicode-escape').decode('ascii'), >>> 'unicode-escape') 'ä' ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue21331> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com