New submission from Thomas Kluyver:

The docs describe the NL token as "Token value used to indicate a 
non-terminating newline. The NEWLINE token indicates the end of a logical line 
of Python code; NL tokens are generated when a logical line of code is 
continued over multiple physical lines."

However, after a comment or a blank line, tokenize emits NL, even when it's not 
inside a multi-line statement. For example:

In [15]: for tok in tokenize.generate_tokens(StringIO('#comment\n').readline):
             print(tok)
TokenInfo(type=54 (COMMENT), string='#comment', start=(1, 0), end=(1, 8), line='#comment\n')
TokenInfo(type=55 (NL), string='\n', start=(1, 8), end=(1, 9), line='#comment\n')
TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0), end=(2, 0), line='')
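
For anyone who wants to reproduce this outside IPython, here is a minimal standalone sketch. The helper name token_names is made up for illustration, and the numeric token types shown above vary between Python versions, so it prints type names instead. It runs the comment case, the blank-line case, and a statement that really is continued over two physical lines, so the three token streams can be compared side by side:

```python
import io
import tokenize


def token_names(source):
    """Return the token type names that tokenize produces for source.

    Hypothetical helper for illustration; not part of the tokenize module.
    """
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]


# A comment-only line, a blank line, and a genuinely continued statement.
for source in ("#comment\n", "\n", "x = (1,\n     2)\n"):
    print(repr(source), token_names(source))
```

The first two inputs contain no multi-line statement, yet NL appears in their token streams just as it does in the third.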

This makes it difficult to use tokenize to detect multi-line statements, as we 
want to do in IPython.
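
Until the behaviour changes, one possible workaround is to track bracket nesting and only treat an NL as a continuation when it occurs inside open brackets. This is a rough sketch under that assumption, not what IPython actually does; the function name continuation_newlines is hypothetical. It does not cover backslash continuations (which produce no NL token) or incomplete input:

```python
import io
import tokenize

OPENERS = {"(", "[", "{"}
CLOSERS = {")", "]", "}"}


def continuation_newlines(source):
    """Return start positions of NL tokens that occur inside open brackets.

    Hypothetical helper: NL tokens after comments or blank lines at
    nesting depth 0 are ignored, since they do not continue a statement.
    """
    depth = 0
    positions = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.OP:
            if tok.string in OPENERS:
                depth += 1
            elif tok.string in CLOSERS:
                depth -= 1
        elif tok.type == tokenize.NL and depth > 0:
            positions.append(tok.start)
    return positions


print(continuation_newlines("#comment\n\n"))
print(continuation_newlines("x = (1,\n     2)\n"))
```

The obvious downside is that every caller has to duplicate the tokenizer's own bracket bookkeeping, which is why a fix in tokenize itself would be preferable.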

In my tests so far, changing two instances of NL to NEWLINE in this block 
(lines 530 & 533) makes it behave as I expect:
http://hg.python.org/cpython/file/a375c3d88c7e/Lib/tokenize.py#l524

----------
messages: 180846
nosy: takluyver
priority: normal
severity: normal
status: open
title: tokenize unconditionally emits NL after comment lines & blank lines
versions: Python 2.6, Python 2.7, Python 3.2, Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17061>
_______________________________________