New submission from Matt Bachmann:

PEP 3131 changed the definition of valid identifiers to match the pattern
<XID_Start> <XID_Continue>*.

Currently, if an identifier contains an invalid character, you get this error:

☺ = 4
SyntaxError: invalid character in identifier

This is fine in most cases. In some cases, however, the problem is not that the character is invalid, but that it may not be used to START the identifier. One example is the "combining grave accent", which is an XID_Continue character but not an XID_Start character. So ̀e is an invalid identifier, but è is a valid identifier. The ̀ character is therefore not invalid in all cases.

The attached patch attempts to clarify this by reporting a different error when the start character is invalid:

>>> ̀e = 4
  File "<stdin>", line 1
    ̀e = 4
    ^
SyntaxError: invalid start character in identifier

However, if the character is simply not allowed (it is neither an XID_Start nor an XID_Continue character), the original error is used:

>>> ☺smile = 4
  File "<stdin>", line 1
    ☺smile = 4
    ^
SyntaxError: invalid character in identifier

----------
components: Unicode
files: clarify_unicode_identifier_errors.patch
keywords: patch
messages: 234222
nosy: Matt.Bachmann, ezio.melotti, haypo
priority: normal
severity: normal
status: open
title: Python 3 gives misleading errors when validating unicode identifiers
type: enhancement
versions: Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6
Added file: http://bugs.python.org/file37755/clarify_unicode_identifier_errors.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23263>
_______________________________________
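
For readers who want to check the start/continue distinction described in the report, here is a minimal sketch using the built-in str.isidentifier(). The helper name classify_identifier_char is hypothetical and is not part of the attached patch; it only mirrors the three cases discussed above (valid start character, valid only as a continuation, not valid at all).

```python
import unicodedata


def classify_identifier_char(ch):
    """Classify one character by how str.isidentifier() treats it.

    Hypothetical helper for illustration; not taken from the patch.
    """
    if ch.isidentifier():
        return "start"      # may begin an identifier
    if ("a" + ch).isidentifier():
        return "continue"   # valid only after the first character
    return "invalid"        # not allowed anywhere in an identifier


# The three characters from the report: e, combining grave accent, smiley.
for ch in ("e", "\u0300", "\u263a"):
    name = unicodedata.name(ch)
    print(f"U+{ord(ch):04X} {name}: {classify_identifier_char(ch)}")
```

Under this sketch, the combining grave accent (U+0300) falls into the "continue" group, which is the case the proposed "invalid start character in identifier" message is meant to cover, while the smiley (U+263A) falls into "invalid".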