Harrison Chudleigh wrote:

> While working on a program, I ran into an error with the usage of the
> module tokenize. The following message was displayed.
>
>   File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/tokenize.py", line 467, in tokenize
>     encoding, consumed = detect_encoding(readline)
>   File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/tokenize.py", line 409, in detect_encoding
>     if first.startswith(BOM_UTF8):
> TypeError: startswith first arg must be str or a tuple of str, not bytes
>
> Undaunted, I changed the error on line 409. The line then read:
>
> if first.startswith(BOM_UTF8):
As Steven says -- don't change the standard library. Your problem is
likely that you are opening the file containing the code you want to
tokenize in text mode. Compare:

$ cat 42.py
42
$ python3
Python 3.4.3 (default, Oct 14 2015, 20:28:29)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tokenize

First with the file opened in text mode:

>>> with open("42.py", "r") as f:
...     for t in tokenize.tokenize(f.readline): print(t)
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/lib/python3.4/tokenize.py", line 468, in tokenize
    encoding, consumed = detect_encoding(readline)
  File "/usr/lib/python3.4/tokenize.py", line 408, in detect_encoding
    if first.startswith(BOM_UTF8):
TypeError: startswith first arg must be str or a tuple of str, not bytes

Now let's switch to binary mode:

>>> with open("42.py", "rb") as f:
...     for t in tokenize.tokenize(f.readline): print(t)
...
TokenInfo(type=56 (ENCODING), string='utf-8', start=(0, 0), end=(0, 0), line='')
TokenInfo(type=2 (NUMBER), string='42', start=(1, 0), end=(1, 2), line='42\n')
TokenInfo(type=4 (NEWLINE), string='\n', start=(1, 2), end=(1, 3), line='42\n')
TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0), end=(2, 0), line='')
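If you would rather not deal with bytes yourself, a minimal sketch of an
alternative (reusing the 42.py file from above): tokenize.open() runs the
encoding detection on the raw bytes and hands you back a text-mode file,
which you can then feed to tokenize.generate_tokens(), the str-based
counterpart of tokenize.tokenize():

import tokenize

# tokenize.open() detects the source encoding (BOM or coding cookie)
# and returns a read-only text-mode file opened with that encoding.
with tokenize.open("42.py") as f:
    # generate_tokens() expects a readline that returns str, not bytes.
    for t in tokenize.generate_tokens(f.readline):
        print(t)

The output is the same token stream as in the binary-mode example, minus
the ENCODING token, since the encoding has already been applied.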