Matthew Barnett added the comment:

In Unicode 9.0.0, U+1885 and U+1886 changed from being 
General_Category=Other_Letter (Lo) to General_Category=Nonspacing_Mark (Mn).

U+2118 is General_Category=Math_Symbol (Sm) and U+212E is 
General_Category=Other_Symbol (So).

\w doesn't include Mn, Sm or So.

The .identifier method uses the Unicode properties XID_Start and XID_Continue, 
which include these codepoints.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30838>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to