Berker Peksag <berker.pek...@gmail.com> added the comment:

The original problem has already been solved by making 
tokenize.generate_tokens() public in issue 12486.

However, the same exception can be raised when tokenize.open() is used with 
tokenize.tokenize(), because tokenize.open() returns a text stream:

    https://github.com/python/cpython/blob/da63b321f63b697f75e7ab2f88f55d907f56c187/Lib/tokenize.py#L396

hello.py
--------

def say_hello():
    print("Hello, World!")

say_hello()


text.py
-------

import tokenize

with tokenize.open('hello.py') as f:
    token_gen = tokenize.tokenize(f.readline)
    for token in token_gen:
        print(token)

When we pass f.readline to tokenize.tokenize(), the second call to 
detect_encoding() fails, because f.readline() returns str rather than the 
bytes that detect_encoding() expects. (The first call to detect_encoding() 
happens inside tokenize.open() itself, before it wraps the binary stream in 
a TextIOWrapper.)
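
For comparison, here is a minimal sketch of the bytes-based call that 
tokenize.tokenize() expects (using the same hello.py as above):

import tokenize

# Opening the file in binary mode gives tokenize.tokenize() the
# bytes-returning readline it expects, so detect_encoding() succeeds.
with open('hello.py', 'rb') as f:
    for token in tokenize.tokenize(f.readline):
        print(token)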

In Lib/test/test_tokenize.py, it seems that tokenize.open() is only tested 
for opening a file; its result is never passed to tokenize.tokenize(). Most 
of the tests pass the readline() method of either open(..., 'rb') or 
io.BytesIO() to tokenize.tokenize().

I will submit a documentation PR that suggests using 
tokenize.generate_tokens() with tokenize.open().
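
Roughly, the pattern I have in mind looks like this (a sketch, not the 
final wording of the PR):

import tokenize

# tokenize.open() returns a text stream, and tokenize.generate_tokens()
# accepts a readline that returns str, so the two compose cleanly.
with tokenize.open('hello.py') as f:
    for token in tokenize.generate_tokens(f.readline):
        print(token)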

----------
assignee:  -> docs@python
components: +Documentation
nosy: +docs@python
versions: +Python 3.7, Python 3.8 -Python 3.5, Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue23297>
_______________________________________