R. David Murray <rdmur...@bitdance.com> added the comment:

I agree that having a unicode API for tokenize seems to make sense, and that it would indeed require a separate issue.
That's a good point about doctest not otherwise supporting coding cookies; those really only apply to source files. Since no doctest fragment ought to contain a coding cookie at the start, your patch ought to be fine. But I'm not familiar with the doctest internals, so having some tests to prove everything works would be great.

Your code could use the tokenize sniffer to make sure the fragment reads as utf-8, and raise an error otherwise. But using a unicode interface to tokenize would probably be cleaner, since I suspect it would mimic what doctest does otherwise (ignore coding cookies). I don't *know* the latter, though, so your checking it would be appreciated.

----------
_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11909>
_______________________________________
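[For reference, the "tokenize sniffer" mentioned above is `tokenize.detect_encoding`. A minimal sketch of the suggested check; the helper name `fragment_is_utf8` is illustrative, not part of any proposed patch:]

```python
import io
import tokenize

def fragment_is_utf8(fragment: bytes) -> bool:
    """Return True if tokenize's encoding sniffer reads the fragment as
    utf-8 (i.e. no BOM or coding cookie selects another encoding)."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(fragment).readline)
    # detect_encoding returns names like 'utf-8', 'utf-8-sig', 'iso-8859-1'.
    return encoding.replace("-", "").lower() in ("utf8", "utf8sig")

# A plain fragment defaults to utf-8; an explicit latin-1 cookie does not.
print(fragment_is_utf8(b"x = 1\n"))                      # → True
print(fragment_is_utf8(b"# -*- coding: latin-1 -*-\n"))  # → False
```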