R. David Murray <rdmur...@bitdance.com> added the comment:

I agree that having a unicode API for tokenize seems to make sense, and that 
would indeed require a separate issue.

That's a good point about doctest not otherwise supporting coding cookies.  
Those really only apply to source files.  So no doctest fragment ought to 
contain a coding cookie at the start, which means your patch ought to be 
fine.  But I'm not familiar with the doctest internals, so having some tests 
to prove everything is fine would be great.

Your code could use the tokenize sniffer to make sure the fragment reads as 
utf-8 and raise an error otherwise.  But using a unicode interface to tokenize 
would probably be cleaner, since I suspect it would mimic what doctest does 
otherwise (ignore coding cookies).  But I don't *know* that, so your checking 
it would be appreciated.
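For concreteness, here's a minimal sketch of what that sniffer check might look like, using tokenize.detect_encoding() on the fragment's bytes (the helper name fragment_encoding is hypothetical, not anything in the patch):

```python
import io
import tokenize

def fragment_encoding(source_bytes):
    """Sniff the encoding of a source fragment the way tokenize does.

    detect_encoding() honors a BOM or a PEP 263 coding cookie in the
    first two lines, and defaults to utf-8 otherwise.
    """
    return tokenize.detect_encoding(io.BytesIO(source_bytes).readline)[0]

# A fragment with no cookie or BOM defaults to utf-8:
print(fragment_encoding(b"print('hello')\n"))  # utf-8

# A coding cookie at the start is picked up (name is normalized):
print(fragment_encoding(b"# -*- coding: latin-1 -*-\nx = 1\n"))  # iso-8859-1
```

A caller that wants to enforce utf-8 could simply raise an error when the sniffed encoding is anything else.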

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11909>
_______________________________________