Kurt B. Kaiser <k...@shore.net> added the comment:

Tcl/Tk uses modified UTF-8 internally.  This includes encoding embedded nulls 
as the two-byte sequence 0xC0 0x80, an overlong encoding of the Unicode null 
character, so that they survive in C's null-terminated strings.  Java does the 
same.
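
To illustrate the point about embedded nulls, here is a minimal Python sketch.
It covers only the null-byte substitution, not the other ways modified UTF-8
can differ from strict UTF-8, and the helper name encode_modified_utf8 is
hypothetical (it is not part of Tcl, Tkinter, or Python).

```python
# Two-byte form that modified UTF-8 uses in place of a real 0x00 byte.
MODIFIED_NUL = b"\xc0\x80"


def encode_modified_utf8(text):
    """Hypothetical helper: encode text so the result holds no 0x00 byte.

    Only the null substitution is handled here.  In UTF-8 a 0x00 byte can
    only come from U+0000, so a plain byte replacement is safe.
    """
    return text.encode("utf-8").replace(b"\x00", MODIFIED_NUL)


def strict_utf8_accepts(raw):
    """Return True if a strict UTF-8 decoder accepts the given bytes."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


encoded = encode_modified_utf8("a\x00b")
print(encoded)                            # no 0x00 byte, safe in a C string
print(strict_utf8_accepts(encoded))       # strict UTF-8 rejects 0xC0 0x80
```

The second call shows why the sequence must not leak out of Tcl: a strict
decoder, such as Python's, treats 0xC0 0x80 as invalid input.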

Note that typing Ctrl-space and Ctrl-2 are conventional ways to enter a null 
from the keyboard.  That's the reason a null char is associated with those key 
combinations.

When Tcl exports Unicode, it is supposed to be strict UTF-8.  Until Tcl 8.5, 
the %A substitution (the Unicode character corresponding to an event) was 
incorrectly leaking the modified UTF-8 null.

_tkinter.c.2.patch is narrowly focused: if PythonCmd raises a 
UnicodeDecodeError and the string passed in an arg is exactly the two bytes 
0xC0 0x80, it is replaced with the Unicode null 0x00.
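
The logic described above can be sketched in Python as follows.  This is only
an illustration of the idea, not the patch itself (which is C code in
_tkinter.c), and the function name decode_tcl_arg is hypothetical.

```python
def decode_tcl_arg(raw):
    """Hypothetical sketch: decode one argument handed over by Tcl.

    Strict UTF-8 is tried first.  Only if that fails, and only if the
    argument is exactly the modified UTF-8 null (0xC0 0x80), is it mapped
    to the Unicode null.  Any other decoding failure is re-raised.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        if raw == b"\xc0\x80":
            return "\x00"
        raise


print(repr(decode_tcl_arg(b"abc")))
print(repr(decode_tcl_arg(b"\xc0\x80")))
```

Keeping the special case inside the error path means that well-formed
arguments take the normal route, and malformed input other than the leaked
null still surfaces as an error.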

----------
assignee: ned.deily -> kbk
components: +Unicode
nosy: +kbk
resolution:  -> accepted
Added file: http://bugs.python.org/file21954/tkinter.c.2.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1028>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
