On 05/18/2012 02:50 AM, Steven D'Aprano wrote:

>> Is it normal that str.isnumeric() returns False for these Cuneiforms?
>>
>>  '\U00012456'
>>  '\U00012457'
>>  '\U00012432'
>>  '\U00012433'
>>
>>  They are all in the Nl category.
>
> Are you sure about that? Do you have a reference?

I was just playing with Unicode on Python 3.3a:

>>> from unicodedata import category, name
>>> from sys import maxunicode

>>> nl = [chr(c) for c in range(maxunicode + 1) \
... if category(chr(c)).startswith('Nl')]

>>> numerics = [chr(c) for c in range(maxunicode + 1) \
... if chr(c).isnumeric()]

>>> for c in set(nl) - set(numerics):
...     print(hex(ord(c)), category(c), name(c))
...
0x12432 Nl CUNEIFORM NUMERIC SIGN SHAR2 TIMES GAL PLUS DISH
0x12433 Nl CUNEIFORM NUMERIC SIGN SHAR2 TIMES GAL PLUS MIN
0x12456 Nl CUNEIFORM NUMERIC SIGN NIGIDAMIN
0x12457 Nl CUNEIFORM NUMERIC SIGN NIGIDAESH

So they are in the Nl category but are not "numerics", and that sounds strange because other Cuneiforms are "numerics":
>>> '\U00012455'.isnumeric(), '\U00012456'.isnumeric()
(True, False)

> It seems to me that they are not:
>
> py>  c = '\U00012456'
> py>  import unicodedata
> py>  unicodedata.numeric(c)
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> ValueError: not a numeric character

Exactly, as I wrote above, is that right?
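
For anyone who wants to compare the three views side by side, here is a minimal sketch. describe() is only a helper name made up for this example. It relies on the fact that the Unicode database records the general category (such as Nl) and the numeric type/value as separate properties, so the two can disagree. The results for the Cuneiform signs depend on the Unicode version bundled with the interpreter, so no expected output is shown.

```python
import unicodedata


def describe(ch):
    """Return (general category, str.isnumeric() result, numeric value or None)."""
    return (
        unicodedata.category(ch),
        ch.isnumeric(),
        # Passing a default avoids the ValueError for characters
        # that have no numeric value in the database.
        unicodedata.numeric(ch, None),
    )


# ROMAN NUMERAL ONE (also in Nl) next to two of the Cuneiform signs
# discussed in this thread.
for ch in ('\u2160', '\U00012455', '\U00012456'):
    print(hex(ord(ch)), describe(ch))
```

If the second and third fields always agree with each other but not with the first, the difference comes from the database properties rather than from str.isnumeric() itself.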

--
Marco
--
http://mail.python.org/mailman/listinfo/python-list
