Hi group, I just came across the following exception:

#v+

$ python
Python 2.4.2 (#2, Sep 30 2005, 21:19:01)
[GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata
>>> u'\N{LATIN LETTER SMALL CAPITAL BARRED B}'
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 0-38: unknown Unicode character name
>>> unicodedata.name(u'\u1d03')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: no such name
>>> ^D
$

#v-

When I check unicodedata.name() against every character listed in the
file /usr/share/unidata/UnicodeData-4.0.1d1b.txt, which came with the
console-data package on my Ubuntu Linux installation, a total of 1226
Unicode characters seem to be missing from the unicodedata module
(2477 missing characters when checking against the latest database
from unicode.org¹).

Is this a deliberate omission?

Cheers,
Klaus.

¹) http://www.unicode.org/Public/UNIDATA/UnicodeData.txt

-- 
Klaus Alexander Seistrup
SubZeroNet, Copenhagen, Denmark
http://magnetic-ink.dk/
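
PS: The check described above can be sketched roughly as follows. This is only an illustration, written for Python 3 rather than the 2.4 interpreter in the session, and the helper names missing_names and missing_from_file are made up for this example. It assumes the usual semicolon-separated UnicodeData.txt layout (code point in the first field, character name in the second) and skips bracketed entries such as <control> and the First/Last range markers, which are not real character names.

```python
import unicodedata


def missing_names(lines):
    """Return (code_point, name) pairs that unicodedata cannot name.

    `lines` is any iterable of UnicodeData.txt-style lines.
    """
    missing = []
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 2 or not fields[0]:
            continue  # blank or malformed line
        name = fields[1]
        if name.startswith("<"):
            continue  # <control>, <..., First>, <..., Last>: not real names
        code_point = int(fields[0], 16)
        # unicodedata.name() returns the default (None) when it has no name
        if unicodedata.name(chr(code_point), None) is None:
            missing.append((code_point, name))
    return missing


def missing_from_file(path):
    """Run the check against a UnicodeData.txt file on disk (hypothetical helper)."""
    with open(path, encoding="ascii") as handle:
        return missing_names(handle)
```

Calling len(missing_from_file("/usr/share/unidata/UnicodeData-4.0.1d1b.txt")) would give the count for that file. Printing unicodedata.unidata_version next to it shows which database version the module was built from, which may help explain a difference.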