I've run into a bit of an issue iterating through files in Python 3.0 and 3.1rc2. When it comes to a file with '\u200b' in the file name, it gives the error...
Traceback (most recent call last):
  File "ListFiles.py", line 19, in <module>
    f.write("file:{0}\n".format(i))
  File "c:\Python31\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u200b' in position 30: character maps to <undefined>

Code is as follows...

    import os

    f = open("dirlist.txt", 'w')

    for root, dirs, files in os.walk("C:\\Users\\Filter\\"):
        f.write("root:{0}\n".format(root))
        f.write("dirs:\n")
        for i in dirs:
            f.write("dir:{0}\n".format(i))
        f.write("files:\n")
        for i in files:
            f.write("file:{0}\n".format(i))

    f.close()
    input("done")

The file it's choking on happens to be a link that Internet Explorer created. There are two files that appear in Explorer to have the same name, but one actually has a zero width space ('\u200b') just before the .url extension. In playing around with this, I've found several files with the same character throughout my file system.

OS: Vista SP2, Language: US English.

Am I doing something wrong, or did I find a bug? It's worth noting that Python 2.6 just displays this character as a ?, just as it appears if you type dir at the Windows command prompt.
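
For what it's worth, the failure looks reproducible without walking the file system at all. The sketch below is only an illustration: the file name "Example\u200b.url" and the helper can_encode are made up for this example, and it assumes (as the traceback suggests) that the text file is being written with the cp1252 codec.

```python
# Hypothetical file name containing a zero width space before the extension.
name = "Example\u200b.url"


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`, else False."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# cp1252 has no mapping for U+200B, while UTF-8 can encode any code point.
print("cp1252:", can_encode(name, "cp1252"))
print("utf-8:", can_encode(name, "utf-8"))
```

If that holds, passing an explicit encoding such as encoding="utf-8" to open() would presumably avoid the error, though that has not been checked against the directory above.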