Eryk Sun added the comment:

Character names are in field 1 of UnicodeData.txt [1][2]. For controls the name
is just "<control>". In Tools/unicode/makeunicodedata.py, the makeunicodename
function skips names that start with "<". Instead of skipping the character, it
could fall back on the Unicode 1.0 name (field 10), if it's defined. For
controls, this is the ISO 6429 name:

    (10) Old name as published in Unicode 1.0 or ISO 6429 names 
    for control functions. This field is empty unless it is 
    significantly different from the current name for the 
    character. No longer used in code chart production. See 
    Name_Alias. 
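
A rough sketch of that fallback, operating on raw UnicodeData.txt records. This
is not the actual makeunicodename code; effective_name is a hypothetical helper
and the three sample records are copied here only for illustration:

```python
# Sample UnicodeData.txt records (semicolon-separated, 15 fields each).
RECORDS = [
    "0000;<control>;Cc;0;BN;;;;;N;NULL;;;;",
    "000A;<control>;Cc;0;B;;;;;N;LINE FEED (LF);;;;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
]

def effective_name(record):
    """Return field 1, or field 10 when field 1 is a "<...>" placeholder.

    Hypothetical helper: returns None when neither field gives a usable
    name (for example, range markers whose field 10 is empty).
    """
    fields = record.split(";")
    name = fields[1]
    if name.startswith("<"):
        # Fall back on the Unicode 1.0 / ISO 6429 name instead of skipping.
        name = fields[10]
    return name or None

names = {int(r.split(";")[0], 16): effective_name(r) for r in RECORDS}
```

With these records, names maps 0x00 and 0x0A to their field 10 values and 0x41
to its regular field 1 name.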

The names of control characters are also in NameAliases.txt, which gets 
processed as the unicode.aliases list of (name, char) tuples.
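
For reference, a minimal sketch of turning NameAliases.txt lines into
(name, char) tuples. It only approximates what the build script does;
parse_aliases is a hypothetical name, and the sample lines are a small excerpt
assumed to match the file's "code;alias;type" layout:

```python
SAMPLE = """\
# Sample excerpt in the NameAliases.txt layout: code point;alias;type
0000;NULL;control
0000;NUL;abbreviation
000A;LINE FEED;control
000A;LF;abbreviation
"""

def parse_aliases(text):
    """Return a list of (name, char) tuples, with char as an int."""
    aliases = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        char, name, _kind = line.split(";")
        aliases.append((name, int(char, 16)))
    return aliases

aliases = parse_aliases(SAMPLE)
# Keep only the "control" aliases, which are the ISO 6429 style names.
control_names = [
    (name, char)
    for (name, char), line in zip(aliases, SAMPLE.splitlines()[1:])
    if line.endswith(";control")
]
```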

[1]: http://www.unicode.org/reports/tr44/#UnicodeData.txt
[2]: http://www.unicode.org/Public/8.0.0/ucd

----------
nosy: +eryksun

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27496>
_______________________________________