Matthew Barnett added the comment:

In Unicode 9.0.0, U+1885 and U+1886 changed from being General_Category=Other_Letter (Lo) to General_Category=Nonspacing_Mark (Mn).
U+2118 is General_Category=Math_Symbol (Sm) and U+212E is General_Category=Other_Symbol (So).

\w doesn't include Mn, Sm or So.

The .isidentifier method uses the Unicode properties XID_Start and XID_Continue, which include these codepoints.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30838>
_______________________________________
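The mismatch described in the comment can be checked with a small script. This is only a sketch: the helper name describe is made up for illustration, and the reported categories depend on the Unicode database bundled with the running interpreter (the Mn values for U+1885 and U+1886 assume Unicode 9.0.0 or later).

```python
import re
import unicodedata

# The four code points named in the comment.
CODEPOINTS = [0x1885, 0x1886, 0x2118, 0x212E]


def describe(cp):
    """Hypothetical helper: return (General_Category, matched by the
    regex word class, accepted by str.isidentifier) for one code point."""
    ch = chr(cp)
    return (
        unicodedata.category(ch),            # e.g. "Sm" for U+2118
        re.fullmatch(r"\w", ch) is not None,  # str patterns are Unicode-aware
        ch.isidentifier(),                   # single character, so XID_Start applies
    )


for cp in CODEPOINTS:
    category, in_word_class, is_ident = describe(cp)
    print(f"U+{cp:04X} category={category} "
          f"word_class={in_word_class} isidentifier={is_ident}")
```

If the comment's analysis holds for the interpreter in use, every row should show word_class=False next to isidentifier=True, which is the discrepancy the issue is about.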