On 7/14/2017 5:51 PM, Marko Rauhamaa wrote:
> Yes, in Python2, Go, C and GNU textutils, when you print a text string
> containing a mixture of languages, you see characters.
>
> Why?
>
> Because that's what the terminal emulator chooses to do upon receiving
> those bytes.

>>> s = u'\u1171\u2222\u3333\u4444\u5555'
>>> s
u'\u1171\u2222\u3333\u4444\u5555'
>>> print(s)
ᅱ∢㌳䑄啕
>>> b = s.encode('utf-8')
>>> b
'\xe1\x85\xb1\xe2\x88\xa2\xe3\x8c\xb3\xe4\x91\x84\xe5\x95\x95'
>>> print(b)
ᅱ∢㌳䑄啕

I prefer the accurate 5-character print of the text string to the print of
the bytes, which only looks right when the terminal decodes them as UTF-8.

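A minimal sketch of the same comparison, assuming Python 3 (where str is
always text, and print(b) shows the bytes repr instead of sending raw bytes
to the terminal). The names n_chars, n_bytes, as_utf8 and as_latin1 are
illustrative only:

```python
# The same five code points as in the session above.
s = '\u1171\u2222\u3333\u4444\u5555'

# Each code point lies in U+0800..U+FFFF, so UTF-8 uses 3 bytes for each.
b = s.encode('utf-8')

n_chars = len(s)   # length of the text string, in code points
n_bytes = len(b)   # length of the encoded form, in bytes

# What a UTF-8 terminal effectively does with the bytes:
as_utf8 = b.decode('utf-8')

# What a terminal assuming Latin-1 would do with the same bytes:
# every byte becomes its own character, so the text is no longer recovered.
as_latin1 = b.decode('latin-1')

print(n_chars, n_bytes)
print(as_utf8 == s, len(as_latin1))
```

The text string has one length and one meaning regardless of the terminal;
the bytes only turn back into those five characters if whatever receives
them decodes them with the encoding that produced them.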
--
Terry Jan Reedy