Steven D'Aprano wrote:

> I have an unexpected display error when dealing with Unicode strings, and
> I cannot understand where the error is occurring. I suspect it's not
> actually a Python issue, but I thought I'd ask here to start.

I suppose it is a Python issue: where Python fails to guess an encoding, it
usually falls back to ASCII.

> But using Python 2.7, I get a really bad case of moji-bake:
>
> [steve@ando ~]$ python2.7 -c "print u'ñøλπйж'"
> ñøλÏйж
>
>
> However, interactively it works fine:
>
> [steve@ando ~]$ python2.7 -E
> Python 2.7.2 (default, May 18 2012, 18:25:10)
> [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> print u'ñøλπйж'
> ñøλπйж

You can provoke it with exec:

>>> exec "print u'ñøλπйж'"
ñøλÏйж
>>> exec u"print u'ñøλπйж'"
ñøλπйж
>>> exec "# -*- coding: utf-8 -*-\nprint u'ñøλπйж'"
ñøλπйж

> This occurs on at least two different machines, one using Centos and the
> other Debian.
>
> Anyone have any idea what's going on? I can replicate the display error
> using Python 3 like this:
>
> py> s = 'ñøλπйж'
> py> print(s.encode('utf-8').decode('latin-1'))
> ñøλÏйж
>
> but I'm not sure why it's happening at the command line. Anyone have any
> ideas?

It is probably buried in the C code; after a few indirections I lost track :(

-- 
https://mail.python.org/mailman/listinfo/python-list
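
A minimal Python 3 sketch of the effect shown in the exec examples, assuming
the byte-string source really is UTF-8 and is being read as Latin-1 when no
coding declaration is present (that is a guess consistent with the output,
not something traced through the C code). The helper names mojibake() and
repair() are made up for illustration:

```python
# Hypothetical helpers: reproduce and undo the UTF-8-read-as-Latin-1 mix-up.

def mojibake(text, actual="utf-8", assumed="latin-1"):
    """Encode text with its real encoding, then decode with the wrong one."""
    return text.encode(actual).decode(assumed)


def repair(garbled, actual="utf-8", assumed="latin-1"):
    """Reverse mojibake(): recover the raw bytes, then decode them correctly."""
    return garbled.encode(assumed).decode(actual)


original = "ñøλπйж"
garbled = mojibake(original)

# Each of the six characters is two bytes in UTF-8, and Latin-1 maps every
# byte to exactly one character, so the garbled string is twice as long.
print(len(original), len(garbled))   # 6 12

# Latin-1 decoding loses no information, so the round trip is exact.
print(repair(garbled) == original)   # True
```

Because Latin-1 accepts every byte value, the wrong decode never raises an
error, which would explain why the output is silently garbled rather than
failing with a UnicodeDecodeError.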