"Amos Anderson" <amosander...@gmail.com> wrote in message news:a073a9cf0906242007k5067314dn8e9d7b1c6da62...@mail.gmail.com...
I've run into a bit of an issue iterating through files in Python 3.0 and
3.1rc2. When it comes to a file with '\u200b' in its name, it gives the
error...

Traceback (most recent call last):
 File "ListFiles.py", line 19, in <module>
   f.write("file:{0}\n".format(i))
 File "c:\Python31\lib\encodings\cp1252.py", line 19, in encode
   return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u200b' in position 30: character maps to <undefined>

Code is as follows...
import os
f = open("dirlist.txt", 'w')

for root, dirs, files in os.walk("C:\\Users\\Filter\\"):
   f.write("root:{0}\n".format(root))
   f.write("dirs:\n")
   for i in dirs:
       f.write("dir:{0}\n".format(i))
   f.write("files:\n")
   for i in files:
       f.write("file:{0}\n".format(i))
f.close()
input("done")

The file it's choking on happens to be a link that Internet Explorer
created. There are two files that appear in explorer to have the same name
but one actually has a zero width space ('\u200b') just before the .url
extension. In playing around with this I've found several files with the
same character throughout my file system. OS: Vista SP2, Language: US
English.

Am I doing something wrong or did I find a bug? It's worth noting that
Python 2.6 just displays this character as a ? just as it appears if you
type dir at the Windows command prompt.

In Python 3.x, strings are Unicode by default. Unless you specify an encoding when opening a text file, Python encodes what you write using the system default encoding. On Windows the filesystem itself uses Unicode, so file names can contain any character, but the default text file encoding on your system is cp1252, which has no mapping for the zero-width space ('\u200b'). That is why the write fails, not the directory walk. Specify an encoding that can represent the character, such as UTF-8, when you open the output file:

f = open('blah.txt', 'w', encoding='utf8')
f.write('\u200b')  # returns 1, the number of characters written
f.close()
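
Applied to the script from the question, the same fix might look roughly like the sketch below. The function name write_listing and its parameters are made up for illustration and are not part of the original script; the only substantive change is the explicit encoding passed to open().

```python
import os

def write_listing(top, out_path, encoding="utf-8", errors="strict"):
    """Walk `top` and write directory and file names to `out_path`.

    Hypothetical helper for illustration only. An explicit encoding
    keeps open() from falling back to the system default (cp1252 on
    the poster's machine).
    """
    with open(out_path, "w", encoding=encoding, errors=errors) as f:
        for root, dirs, files in os.walk(top):
            f.write("root:{0}\n".format(root))
            f.write("dirs:\n")
            for name in dirs:
                f.write("dir:{0}\n".format(name))
            f.write("files:\n")
            for name in files:
                f.write("file:{0}\n".format(name))
```

Calling write_listing("C:\\Users\\Filter\\", "dirlist.txt") should then write the zero-width space without an error. If the output really has to stay in cp1252, passing encoding="cp1252", errors="replace" writes a ? in place of each character that cannot be encoded, which is close to what Python 2.6 displayed.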

-Mark

