On Fri, Dec 18, 2015 at 5:36 PM, Terry Reedy <tjre...@udel.edu> wrote:
> Last I knew, Guido still wanted stdlib files to be all-ascii, especially
> possibly in special cases. There is no good reason I can think of for there
> to be an invisible non-ascii space in a comment. It strikes me as most
> likely an accident (typo) that should be fixed. I suspect the same of most
> of the following. Perhaps you should file an issue (and patch?) on the
> tracker.

You're probably right on that one. Here are the others - and the script I
used to find them.

import os

for root, dirs, files in os.walk("."):
    # Skip test directories and test files
    if "test" in root:
        continue
    for fn in files:
        if not fn.endswith(".py"):
            continue
        if "test" in fn:
            continue
        with open(os.path.join(root, fn), "rb") as f:
            # Note: enumerate() starts at 0, so reported line numbers are 0-based
            for l, line in enumerate(f):
                try:
                    line.decode("ascii")
                    continue  # Ignore the ASCII lines
                except UnicodeDecodeError:
                    line = line.rstrip(b"\n")
                    try:
                        line = line.decode("UTF-8")
                    except UnicodeDecodeError:
                        # If it's not UTF-8 either, show it as b'...'
                        line = repr(line)
                    print("%s:%d: %s" % (fn, l, line))

shlex.py:37: self.wordchars += ('ßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ'
shlex.py:38: 'ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ')
functools.py:7: # and Łukasz Langa <lukasz at langa.pl>.
heapq.py:34: [explanation by François Pinard]
getopt.py:21: # Peter Åstrand <astr...@lysator.liu.se> added gnu_getopt().
sre_compile.py:26: (0x69, 0x131), # iı
sre_compile.py:28: (0x73, 0x17f), # sſ
sre_compile.py:30: (0xb5, 0x3bc), # µμ
sre_compile.py:32: (0x345, 0x3b9, 0x1fbe), # \u0345ιι
sre_compile.py:34: (0x390, 0x1fd3), # ΐΐ
sre_compile.py:36: (0x3b0, 0x1fe3), # ΰΰ
sre_compile.py:38: (0x3b2, 0x3d0), # βϐ
sre_compile.py:40: (0x3b5, 0x3f5), # εϵ
sre_compile.py:42: (0x3b8, 0x3d1), # θϑ
sre_compile.py:44: (0x3ba, 0x3f0), # κϰ
sre_compile.py:46: (0x3c0, 0x3d6), # πϖ
sre_compile.py:48: (0x3c1, 0x3f1), # ρϱ
sre_compile.py:50: (0x3c2, 0x3c3), # ςσ
sre_compile.py:52: (0x3c6, 0x3d5), # φϕ
sre_compile.py:54: (0x1e61, 0x1e9b), # ṡẛ
sre_compile.py:56: (0xfb05, 0xfb06), # ſtst
punycode.py:2: Written by Martin v. Löwis.
koi8_t.py:2: # http://ru.wikipedia.org/wiki/КОИ-8
__init__.py:0: # Copyright (C) 2005 Martin v. Löwis
client.py:737: a Date representing the file’s last-modified time, a
client.py:739: containing a guess at the file’s type. See also the
bdist_msi.py:0: # Copyright (C) 2005, 2006 Martin von Löwis
connection.py:399: # Issue # 20540: concatenate before sending, to avoid delays due
message.py:531: filename=('utf-8', '', Fußballer.ppt'))
message.py:533: filename='Fußballer.ppt'))
request.py:181: * geturl() — return the URL of the resource retrieved, commonly used to
request.py:184: * info() — return the meta-information of the page, such as headers, in the
request.py:188: * getcode() – return the HTTP status code of the response. Raises URLError
dbapi2.py:2: # Copyright (C) 2004-2005 Gerhard Häring <g...@ghaering.de>
__init__.py:2: # Copyright (C) 2005 Gerhard Häring <g...@ghaering.de>

They're nearly all comments. A few string literals. I would be
inclined to ASCIIfy the apostrophes, dashes, and the connection.py
space that started this thread. People's names, URLs, and
demonstrative characters I'm more inclined to leave.

Agreed?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
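
For anyone who wants to try the ASCIIfication suggested above, here is a rough sketch of how it might look. The helper name asciify_punctuation and the mapping table are hypothetical, not anything from the stdlib or from an actual patch, and the table assumes the connection.py space is U+00A0 (no-break space). It only touches typographic apostrophes, dashes, and that space, so names like Löwis pass through unchanged.

```python
# Hypothetical helper, not part of the stdlib or of the script above.
# The mapping is an assumption: it covers only typographic apostrophes,
# dashes, and the no-break space, and leaves all other characters alone.
ASCII_REPLACEMENTS = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (typographic apostrophe)
    "\u2013": "-",   # en dash
    "\u2014": "--",  # em dash
    "\u00a0": " ",   # no-break space (assumed to be the connection.py space)
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(ASCII_REPLACEMENTS)


def asciify_punctuation(text):
    """Replace typographic punctuation with ASCII; keep everything else."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    sample = "the file\u2019s type \u2014 see Martin v. L\u00f6wis"
    print(asciify_punctuation(sample))
```

Using an explicit table rather than something like unicodedata.normalize keeps the change narrow: only the listed characters are rewritten, so people's names and the demonstrative characters in sre_compile.py are left as they are.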