sam wrote:
> I have a byte stream read over the internet:
>
> responseByteStream = urllib.request.urlopen( httpRequest );
> responseByteArray = responseByteStream.read();
>
> The characters are encoded with unicode escape sequences, for example
> a copyright symbol appears in the stream as the bytes:
>
> 5C 75 30 30 61 39
>
> which translates to:
> \u00a9
>
> which is unicode for the copyright symbol.
>
> I am simply trying to display this copyright symbol on a webpage, so
> how do I encode the byte array to utf-8 given that it is 'escape
> encoded' in the above way? I tried:
>
> responseByteArray.decode('utf-8')
> and responseByteArray.decode('unicode_escape')
> and str(responseByteArray).
>
> I am using Python 3.1.

Convert the bytes to unicode first:

>>> u = b"\\u00a9".decode("unicode-escape")
>>> u
'©'

Then convert the string to bytes:

>>> u.encode("utf-8")
b'\xc2\xa9'
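
Putting both steps together for the exact bytes from the original post, here is a minimal sketch. The helper name unescape_to_utf8 is my own invention for illustration, not a library function, and it assumes the payload is plain ASCII apart from the \uXXXX escapes:

```python
def unescape_to_utf8(raw: bytes) -> bytes:
    """Hypothetical helper: turn b'\\u00a9'-style escaped bytes into UTF-8 bytes."""
    # Step 1: bytes -> str, interpreting \uXXXX escape sequences.
    text = raw.decode("unicode_escape")
    # Step 2: str -> bytes, this time encoded as UTF-8.
    return text.encode("utf-8")


# The six bytes quoted in the question: 5C 75 30 30 61 39, i.e. b'\\u00a9'.
sample = bytes.fromhex("5C 75 30 30 61 39")

print(unescape_to_utf8(sample))  # b'\xc2\xa9'
```

One caveat: unicode_escape treats any byte that is not part of an escape as Latin-1, so if the response also contains raw non-ASCII UTF-8 bytes, this round trip will likely mangle them.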