New submission from monson <holymon...@gmail.com>:

Python 3.0 allows identifiers to contain characters from outside the ASCII
range (see PEP 3131 and
https://docs.python.org/3/reference/lexical_analysis.html#identifiers).

But lib2to3 can't tokenize them correctly:
```
$ echo '中 = 1' | python3.7 -m lib2to3.pgen2.tokenize
1,0-1,1:        ERRORTOKEN      '中'
1,2-1,3:        OP      '='
1,4-1,5:        NUMBER  '1'
1,5-1,6:        NEWLINE '\n'
2,0-2,0:        ENDMARKER       ''
```
'中' should be tokenized as NAME instead of ERRORTOKEN.
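
For comparison, here is a small sketch of the expected behaviour. It uses the standard `tokenize` module rather than lib2to3's copy, and the sample source string is the same one-line assignment as above. It is only meant to illustrate what "tokenized as NAME" looks like, not to suggest how lib2to3 should be fixed.

```python
import io
import tokenize

# Same one-line program as in the shell example above.
source = "中 = 1\n"

# str.isidentifier() applies the identifier rules from PEP 3131.
is_ident = "中".isidentifier()
print(is_ident)

# Tokenize with the standard library tokenizer (not lib2to3.pgen2.tokenize)
# and look at the type of the first token.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
first = tokens[0]
first_type = tokenize.tok_name[first.type]
print(first_type, repr(first.string))
```

The first token reported by the standard tokenizer is the one lib2to3 currently reports as ERRORTOKEN.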

----------
components: Library (Lib)
messages: 324148
nosy: monson
priority: normal
severity: normal
status: open
title: lib2to3: support non-ASCII identifiers
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34515>
_______________________________________