Ezio Melotti <ezio.melo...@gmail.com> added the comment:

Here is a patch to "fix" sys_displayhook. Note that the patch is just a proof
of concept: it seems to work fine, but I still have to clean it up, add
comments, rename and reorganize some variables, and add tests.
This is an example of the output when using iso-8859-1 as the IO encoding:

w...@linuxvm:~/dev/py3k$ PYTHONIOENCODING=iso-8859-1 ./python
Python 3.2a0 (py3k:82643:82644M, Jul  9 2010, 11:39:25)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys; sys.stdout.encoding, sys.stdin.encoding
('iso-8859-1', 'iso-8859-1')
>>> 'ascii string'
'ascii string'  # works fine
>>> 'some accented chars: öäå'
'some accented chars: öäå'  # works fine - these chars are encodable
>>> 'a snowman: \u2603'
'a snowman: \u2603'  # non-encodable - the char is escaped instead of raising an error
>>> 'snowman: \u2603, and accented öäå'
'snowman: \u2603, and accented öäå' # only non-encodable chars are escaped
>>> # the behavior of print is still the same:
>>> print('some accented chars: öäå') 
some accented chars: öäå
>>> print('a snowman: \u2603')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2603' in position 11: ordinal not in range(256)
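
The escaping shown in the session above can be approximated in pure Python.
This is only a sketch and not the code from the attached patch: the names
escape_unencodable and escaping_displayhook are hypothetical, and it assumes
the stream encoding is one that the 'backslashreplace' error handler can
round-trip, such as iso-8859-1.

```python
import builtins
import sys


def escape_unencodable(text, encoding):
    """Return text with every char that `encoding` cannot represent
    replaced by a backslash escape; encodable chars are left alone."""
    return text.encode(encoding, 'backslashreplace').decode(encoding)


def escaping_displayhook(value):
    """Hypothetical pure-Python stand-in for the patched sys_displayhook."""
    if value is None:
        return
    builtins._ = None
    # Fall back to ASCII if the stream does not report an encoding.
    encoding = getattr(sys.stdout, 'encoding', None) or 'ascii'
    sys.stdout.write(escape_unencodable(repr(value), encoding) + '\n')
    builtins._ = value
```

Assigning escaping_displayhook to sys.displayhook would only affect how the
interactive prompt echoes values; print() still writes to sys.stdout directly
and keeps raising UnicodeEncodeError, as in the session above.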

-------------------------------------

While testing the patch with PYTHONIOENCODING=iso-8859-1 I also found this
weird issue. It is *not* related to the patch, since I managed to reproduce
it on a clean py3k using PYTHONIOENCODING=iso-8859-1:
>>> 'òàùèì  óáúéí  öäüëï'
'ò�\xa0ùèì  óáúé�\xad  öäüëï'
>>> 'òàùèì  óáúéí  öäüëï'.encode('iso-8859-1')
b'\xc3\xb2\xc3\xa0\xc3\xb9\xc3\xa8\xc3\xac  \xc3\xb3\xc3\xa1\xc3\xba\xc3\xa9\xc3\xad  \xc3\xb6\xc3\xa4\xc3\xbc\xc3\xab\xc3\xaf'
>>> 'òàùèì'.encode('utf-8')
b'\xc3\x83\xc2\xb2\xc3\x83\xc2\xa0\xc3\x83\xc2\xb9\xc3\x83\xc2\xa8\xc3\x83\xc2\xac'

I think there might be some conflict between the IO encoding that I specified
and the one that my terminal actually uses, but I couldn't figure out what's
going on exactly (it is also weird that only 'à' and 'í' are not displayed
correctly). Unless this behavior is expected, I'll open another issue about it.
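
One plausible way to reproduce the output above without a terminal is sketched
below. It assumes (this is not confirmed in the report) that the terminal sends
and displays UTF-8 while Python decodes stdin and encodes stdout as iso-8859-1.
Under that assumption, the second UTF-8 byte of 'à' is 0xa0 (no-break space)
and that of 'í' is 0xad (soft hyphen); repr() escapes both because they are
non-printable, which leaves a lone 0xc3 byte that the terminal cannot decode.

```python
# Assumption: UTF-8 terminal, PYTHONIOENCODING=iso-8859-1.
typed = 'òàùèì'

# What Python receives: the terminal's UTF-8 bytes decoded as latin-1.
seen = typed.encode('utf-8').decode('iso-8859-1')

# repr() escapes non-printable chars such as U+00A0, so the byte pair
# that made up 'à' is split into 'Ã' followed by the text '\xa0'.
shown = repr(seen)

# What the terminal displays: latin-1 bytes interpreted as UTF-8, with
# undecodable bytes shown as U+FFFD (the replacement character).
echoed = shown.encode('iso-8859-1').decode('utf-8', 'replace')

print(echoed)
```

The other characters survive only because their second UTF-8 byte happens to
map to a printable latin-1 character, so the two mismatched conversions cancel
out.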

----------
keywords: +patch
Added file: http://bugs.python.org/file17915/issue9198.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9198>
_______________________________________