Andrew Svetlov <andrew.svet...@gmail.com> added the comment:

I consulted with Martin at the PyCon sprint and he suggested the solution that 
I'm following: to treat `print` and the REPL (read-eval-print loop) separately.

Output passed to the print() function is encoded with sys.stdout.encoding.

The UTF encodings were designed to represent any Unicode character.
Linux is usually set up to use the utf-8 encoding by default (see the LANG 
environment variable), so there are no encoding issues there.
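
To make the role of the stream encoding concrete, here is a minimal sketch. 
It uses in-memory streams as stand-ins for sys.stdout, so it does not depend 
on the real terminal; the helper name try_print is hypothetical and not part 
of Python.

```python
import io

def try_print(text, encoding):
    """Print text to an in-memory stream that uses the given encoding.

    Hypothetical helper: returns the encoded bytes, or None if the
    text cannot be encoded with that encoding.
    """
    raw = io.BytesIO()
    # newline='\n' keeps the line ending the same on every platform
    stream = io.TextIOWrapper(raw, encoding=encoding, newline='\n')
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError:
        return None
    return raw.getvalue()

print(try_print('\U00010340', 'utf-8'))   # encodes fine
print(try_print('\U00010340', 'ascii'))   # cannot be encoded
```

With a utf-8 stream the non-BMP character is written as four bytes; with an 
ascii stream print() fails, and the helper reports that as None.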

xterm (a rather old terminal), which you use, cannot display non-BMP characters 
and replaces them with question marks.
A modern gnome-terminal displays those symbols just fine.

Let's return to non-UTF terminal encodings.
If a character cannot be encoded, Python raises UnicodeEncodeError.
Here is an example:

andrew@tiktaalik ~/p/cpython> bash -c "LANG=C; ./python"
Python 3.3.0a1+ (qbase qtip tip tk:c3ce8a8e6c9c+, Mar 14 2012, 15:54:55) 
[GCC 4.6.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> '\U00010340'
'\U00010340'
>>> print('\U00010340')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\U00010340' in 
position 0: ordinal not in range(128)
>>> 

As you can see, I have switched LANG to the C locale (effectively ASCII).

The evaluated expression is echoed with unicode escaping, but the `print` call 
raises an error.
This happens because Python's REPL calls sys.displayhook.
See http://docs.python.org/dev/library/sys.html#sys.displayhook for details.
That code escapes unicode if the terminal encoding doesn't support it.
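
As a rough illustration of that fallback (a simplified sketch, not the actual 
sys.displayhook implementation), the escaping can be reproduced with the 
'backslashreplace' error handler. The function name escaped_repr is 
hypothetical.

```python
def escaped_repr(value, encoding):
    """Return repr(value), escaping what the encoding cannot represent.

    Hypothetical helper that mimics the fallback idea only.
    """
    text = repr(value)
    try:
        # If the whole repr is encodable, show it unchanged
        text.encode(encoding)
        return text
    except UnicodeEncodeError:
        # Otherwise replace unencodable characters with backslash escapes
        return text.encode(encoding, 'backslashreplace').decode(encoding)

print(escaped_repr('\U00010340', 'ascii'))  # prints '\U00010340'
```

The print() function has no such fallback: it writes the string as is, so the 
encoding error reaches the caller.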

The same applies to Windows, OS X and any other platform.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14200>
_______________________________________