R. David Murray added the comment: The docs could certainly be more explicit...currently they state that tokenize is *detecting* the encoding of the file, which *implies* but does not make explicit that the input must be binary, not text.
The doc problem will get fixed as part of the fix to issue 12486, so I'm closing this as a duplicate. If you want to help out with a patch review and doc patch suggestions on that issue, that would be great. ---------- nosy: +r.david.murray resolution: -> duplicate stage: -> committed/rejected status: open -> closed superseder: -> tokenize module should have a unicode API _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue17125> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com