[issue5127] UnicodeEncodeError - I can't even see license

Ezio Melotti Tue, 03 Feb 2009 04:26:39 -0800

Ezio Melotti <ezio.melo...@gmail.com> added the comment:

FWIW, on Python3 it seems to work:
>>> import unicodedata
>>> unicodedata.category("\U00010000")
'Lo'
>>> unicodedata.category("\U00011000")
'Cn'
>>> unicodedata.category(chr(0x10000))
'Lo'
>>> unicodedata.category(chr(0x11000))
'Cn'
>>> ord(chr(0x10000)), 0x10000
(65536, 65536)
>>> ord(chr(0x11000)), 0x11000
(69632, 69632)


I'm using a narrow build too:
>>> import sys
>>> sys.maxunicode
65535
>>> len('\U00010000')
2
>>> ord('\U00010000')
65536

On Python2 unichr() is supposed to raise a ValueError on a narrow build
if the value is greater than 0xFFFF [1], but if the characters above
0xFFFF can be represented with u"\Uxxxxxxxx" there should be a way to
fix unichr so it can return them. Python3 already does it with chr().

Maybe we should open a new issue for this if it's not present already.

[1]: http://docs.python.org/library/functions.html#unichr

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue5127>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue5127] UnicodeEncodeError - I can't even see license

Reply via email to