[issue12567] curses implementation of Unicode is wrong in Python 3

STINNER Victor Thu, 14 Jul 2011 15:33:43 -0700

New submission from STINNER Victor <[email protected]>:

curses functions that accept strings implicitly encode character strings to 
UTF-8. This is wrong. We should add a function to set the encoding (see issue 
#6745) or use the wide character C functions. I don't think that UTF-8 is the 
right default encoding; I suppose that the locale encoding is a better choice.
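
A minimal sketch of the locale-encoding idea, assuming a hypothetical helper 
named encode_for_curses (it is not part of the curses module) and using only 
the standard locale module:

```python
import locale


def encode_for_curses(text, encoding=None):
    """Hypothetical helper: encode text with the locale encoding by default.

    This only illustrates the proposed default; it does not call curses.
    """
    if encoding is None:
        # Ask the C library for the user's preferred encoding without
        # changing the process-wide locale settings.
        encoding = locale.getpreferredencoding(False)
    # 'replace' avoids an exception if the locale cannot represent a character.
    return text.encode(encoding, errors='replace')


data = encode_for_curses('caf\xe9')
explicit = encode_for_curses('caf\xe9', encoding='utf-8')
```

With an explicit encoding='utf-8' the result matches what curses does today; 
without it, the bytes depend on the user's locale.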


Accepting characters (and character strings) but calling byte functions is 
wrong. For example, addch('é') doesn't work with a UTF-8 locale encoding. It 
calls waddch(0xE9) (é is U+00E9), whereas waddch(0xC3) followed by 
waddch(0xA9) should be called, because é is encoded as 0xC3 0xA9 in UTF-8. 
Workaround in Python:

    for byte in 'é'.encode('utf-8'):
        win.addch(byte)
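
To make the byte values concrete, the following sketch compares the code 
point of 'é' with its UTF-8 encoding. It does not need a terminal; the 
variable names are illustrative only:

```python
ch = '\xe9'  # the character é, U+00E9

# The value passed to waddch() when the code point is used directly.
code_point = ord(ch)

# The byte values that a UTF-8 terminal actually expects, one per waddch() call.
utf8_bytes = list(ch.encode('utf-8'))

print(hex(code_point), [hex(b) for b in utf8_bytes])
```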

I see two possible solutions:

A) Add new functions that only accept characters, and stop accepting 
characters in the existing functions

B) Fix the existing functions to call the right C function depending on the 
input type. For example, Python addch(10) and addch(b'\n') would call 
waddch(10), whereas addch('é') would call wadd_wch(233).

I prefer solution (B) because addch('é') would just work as expected.
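
The dispatch in solution (B) could look roughly like the sketch below. The 
helper choose_c_call is hypothetical: it only reports which C function would 
be selected and with which integer value, and it does not call curses:

```python
def choose_c_call(ch):
    """Hypothetical helper: return (C function name, integer value) for ch."""
    if isinstance(ch, int):
        # Integers are treated as byte values, as today.
        return ('waddch', ch)
    if isinstance(ch, bytes):
        if len(ch) != 1:
            raise TypeError('expected a bytes object of length 1')
        return ('waddch', ch[0])
    if isinstance(ch, str):
        if len(ch) != 1:
            raise TypeError('expected a str of length 1')
        # Characters go to the wide character function, by code point.
        return ('wadd_wch', ord(ch))
    raise TypeError('expected int, bytes or str, got %s' % type(ch).__name__)


calls = [choose_c_call(10), choose_c_call(b'\n'), choose_c_call('\xe9')]
```

Here calls holds the three cases from the description: the integer and the 
byte string both select waddch, and the character selects wadd_wch.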

----------
components: Library (Lib)
messages: 140375
nosy: Nicholas.Cole, akuchling, cben, gpolo, haypo, inigoserna, python-dev, 
r.david.murray, schodet, zeha
priority: normal
severity: normal
status: open
title: curses implementation of Unicode is wrong in Python 3
versions: Python 3.3

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue12567>
_______________________________________