R. David Murray added the comment:
The docs could certainly be more explicit... currently they state that tokenize
is *detecting* the encoding of the file, which *implies* but does not make
explicit that the input must be binary, not text.
The doc problem will get fixed as part of the fix to is
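To illustrate the binary-vs-text point above: a minimal sketch of what happens when tokenize.tokenize is handed a text-mode readline (str lines) instead of a bytes one. Because encoding detection compares the first line against the BOM_UTF8 bytes constant, the str/bytes mismatch surfaces as a TypeError:

```python
import io
import tokenize

# tokenize.tokenize expects a readline callable returning bytes.
# Feeding it str lines (as from a text-mode file) fails during
# encoding detection, where the first line is compared to BOM_UTF8.
try:
    list(tokenize.tokenize(io.StringIO("x = 1\n").readline))
except TypeError as exc:
    print("TypeError:", exc)
```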
New submission from Tyler Crompton:
Line 402 of lib/python3.3/tokenize.py contains the following:
if first.startswith(BOM_UTF8):
BOM_UTF8 is a bytes object, and str.startswith does not accept bytes arguments.
I was able to use tokenize.tokenize only after making the following changes:
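For reference, the intended usage works without modification when the source is supplied as bytes, e.g. via a file opened in 'rb' mode or an io.BytesIO wrapper; the first token emitted then reports the detected encoding:

```python
import io
import tokenize

# tokenize.tokenize wants a bytes-returning readline; io.BytesIO
# stands in here for a file opened in binary ('rb') mode.
source = b"x = 1\n"
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

# The first token is the ENCODING token carrying the detected encoding.
print(tokens[0].type == tokenize.ENCODING)
print(tokens[0].string)
```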
Cha