Andrew Svetlov <andrew.svet...@gmail.com> added the comment:

I consulted with Martin at the PyCon sprint and he suggested the solution I'm following: to split `print` and the REPL (read-eval-print loop).
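To make the split between `print` and the REPL echo concrete, here is a minimal sketch. It assumes an ASCII-only output stream and loosely follows the fallback described in the sys.displayhook documentation. The helper names `make_ascii_stdout` and `echo_like_repl` are hypothetical and exist only for this illustration; this is not the interpreter's actual code.

```python
import io


def make_ascii_stdout():
    """Hypothetical helper: build a text stream that behaves like an
    ASCII-only sys.stdout, plus the byte buffer it writes into."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(
        raw,
        encoding="ascii",
        errors="strict",
        newline="\n",        # no platform-specific newline translation
        write_through=True,  # push bytes to the buffer on every write
    )
    return stream, raw


def echo_like_repl(value, stream):
    """Hypothetical helper: echo repr(value) the way a displayhook-style
    fallback would, escaping characters the stream cannot encode."""
    text = repr(value)
    try:
        stream.write(text)
    except UnicodeEncodeError:
        # Fall back to backslash escapes instead of failing.
        data = text.encode(stream.encoding, "backslashreplace")
        stream.buffer.write(data)
    stream.write("\n")
    stream.flush()


if __name__ == "__main__":
    stream, raw = make_ascii_stdout()

    # REPL-style echo: succeeds, the character is written in escaped form.
    echo_like_repl("\U00010340", stream)
    print(raw.getvalue())

    # print(): no fallback, so the strict ASCII codec raises.
    try:
        print("\U00010340", file=stream)
    except UnicodeEncodeError as exc:
        print("print failed:", exc.reason)
```

The echo path catches UnicodeEncodeError and re-encodes with the `backslashreplace` error handler, while `print` writes through the stream's strict encoder and lets the exception propagate.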
Output passed to the print() function is encoded with sys.stdout.encoding.

The UTF encodings were invented to support any character. Linux is usually set up to use UTF-8 by default (see the LANG environment variable), and there are no encoding issues with that. xterm (a fairly old terminal), which you use, cannot display non-BMP characters and replaces them with question marks. Modern gnome-terminal displays those symbols fine.

Let's return to non-UTF terminal encodings. If a character cannot be encoded, Python raises UnicodeEncodeError. Here is an example:

andrew@tiktaalik ~/p/cpython> bash -c "LANG=C; ./python"
Python 3.3.0a1+ (qbase qtip tip tk:c3ce8a8e6c9c+, Mar 14 2012, 15:54:55)
[GCC 4.6.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> '\U00010340'
'\U00010340'
>>> print('\U00010340')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\U00010340' in position 0: ordinal not in range(128)
>>>

As you can see, I switched LANG to the C locale (which implies ASCII). The evaluated expression was echoed with Unicode escaping, but the `print` call raised an error.

This happens because Python's REPL calls sys.displayhook. See http://docs.python.org/dev/library/sys.html#sys.displayhook for details. That code escapes Unicode characters if the terminal doesn't support them.

The same holds for Windows, OS X and any other platform.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14200>
_______________________________________