On 25 January 2014 04:37, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:
>
> But using Python 2.7, I get a really bad case of moji-bake:
>
> [steve@ando ~]$ python2.7 -c "print u'ñøλπйж'"
> Ã±Ã¸Î»ÏÐ¹Ð¶
>
> However, interactively it works fine:
>
> [steve@ando ~]$ python2.7 -E
> Python 2.7.2 (default, May 18 2012, 18:25:10)
> [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> print u'ñøλπйж'
> ñøλπйж
>
> This occurs on at least two different machines, one using Centos and the
> other Debian.

Same for me. It's to do with using a u literal:

$ python2.7 -c "print('ñøλπйж')"
ñøλπйж
$ python2.7 -c "print(u'ñøλπйж')"
Ã±Ã¸Î»ÏÐ¹Ð¶
$ python2.7 -c "print(repr('ñøλπйж'))"
'\xc3\xb1\xc3\xb8\xce\xbb\xcf\x80\xd0\xb9\xd0\xb6'
$ python2.7 -c "print(repr(u'ñøλπйж'))"
u'\xc3\xb1\xc3\xb8\xce\xbb\xcf\x80\xd0\xb9\xd0\xb6'

$ python2.7
Python 2.7.5+ (default, Sep 19 2013, 13:49:51)
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> b='\xc3\xb1\xc3\xb8\xce\xbb\xcf\x80\xd0\xb9\xd0\xb6'
>>> print(b)
ñøλπйж
>>> s=u'\xc3\xb1\xc3\xb8\xce\xbb\xcf\x80\xd0\xb9\xd0\xb6'
>>> print(s)
Ã±Ã¸Î»ÏÐ¹Ð¶
>>> print(s.encode('latin-1'))
ñøλπйж
>>> import sys
>>> sys.getdefaultencoding()
'ascii'

It works in the interactive prompt:

>>> s = 'ñøλπйж'
>>> print(s)
ñøλπйж
>>> s = u'ñøλπйж'
>>> print(s)
ñøλπйж

But the interactive prompt has an associated encoding:

>>> import sys
>>> sys.stdout.encoding
'UTF-8'

If I put it into a utf-8 file with no encoding declared I get a SyntaxError:

$ cat tmp.py
s = u'ñøλπйж'
print(s)
oscar@tonis-laptop:~$ python2.7 tmp.py
  File "tmp.py", line 1
SyntaxError: Non-ASCII character '\xc3' in file tmp.py on line 1, but
no encoding declared; see http://www.python.org/peps/pep-0263.html for
details

If I add the encoding declaration it works:

oscar@tonis-laptop:~$ vim tmp.py
oscar@tonis-laptop:~$ cat tmp.py
# -*- coding: utf-8 -*-

s = u'ñøλπйж'
print(s)
oscar@tonis-laptop:~$ python2.7 tmp.py
ñøλπйж
oscar@tonis-laptop:~$

So I'd say that your original example should be a SyntaxError with
Python 2.7 but instead it implicitly uses latin-1.


Oscar
--
https://mail.python.org/mailman/listinfo/python-list
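
P.S. For anyone who wants to reproduce the effect without a Python 2.7 install, here is a minimal sketch of what I think is going on, assuming the terminal sends UTF-8 and that the u literal passed to -c is decoded byte-for-byte as latin-1. It is written for Python 3, and the variable names are only for illustration:

```python
# -*- coding: utf-8 -*-
# Sketch only: simulates the suspected mis-decoding, it does not call 2.7.

text = u'ñøλπйж'

# The bytes a UTF-8 terminal hands to `python -c`.
raw = text.encode('utf-8')

# Assumption: the u'' literal is decoded as latin-1, so each of the
# 12 bytes becomes its own code point instead of 6 real characters.
misdecoded = raw.decode('latin-1')

# Encoding back to latin-1 restores the original bytes, which is why
# print(s.encode('latin-1')) displays correctly on a UTF-8 terminal.
recovered = misdecoded.encode('latin-1').decode('utf-8')

print(repr(raw))
print(len(text), len(misdecoded))   # 6 12
print(recovered == text)            # True
```

The repr of raw matches the escapes in the transcripts above, which is what makes me think latin-1 is the implicit codec here.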