On 7/24/2011 11:15 AM, Joao Jacome wrote:
http://pastebin.com/aMrzczt4

        list = os.listdir(dir)
While somewhat natural, using 'list' as a local name and masking the builtin list function is a *very bad* idea. Someday you will do this and then use 'list(args)' expecting to call the list function, and it will not work.
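
A small sketch of how the masking bites, written so it runs as-is. The function names and the sample data are made up for illustration; they are not from your script.

```python
def count_entries_shadowed(names):
    # Assigning to 'list' makes it a local name that hides the builtin.
    list = sorted(names)
    try:
        # This now tries to call the local list object, not the builtin.
        return len(list(names))
    except TypeError as exc:
        return "TypeError: %s" % exc


def count_entries(names):
    # A different local name leaves the builtin list() usable.
    entries = sorted(names)
    return len(list(entries))


print(count_entries_shadowed(["b.txt", "a.txt"]))
print(count_entries(["b.txt", "a.txt"]))
```

Renaming the local to something like 'entries' or 'filenames' avoids the problem entirely.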

When the script reaches a file with latin characters (ê é ã etc) it crashes.

Traceback (most recent call last):
   File "C:\backup\ORGANI~1\teste.py", line 37, in <module>
     Retrieve(rootdir);
   File "C:\backup\ORGANI~1\teste.py", line 25, in Retrieve
     Retrieve(os.path.join(dir,filename))
   File "C:\backup\ORGANI~1\teste.py", line 18, in Retrieve
     print l
   File "C:\Python27\lib\encodings\cp850.py", line 12, in encode
     return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\x8a' in position 43: character maps to <undefined>

'\x8a' *is* the cp850-encoded byte for e with a grave accent: è
But your program treats it as a unicode value, where it is a control character (Line Tabulation Set), and tries to encode it to cp850, which is not possible.
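
To make the two readings of 0x8A concrete, here is a minimal sketch. It uses Python 3 syntax (your script is on 2.7), and the latin-1 step is only my guess at one way a byte can end up as the code point with the same number; I have not confirmed that this is what happens in your script.

```python
import unicodedata

raw = b"\x8a"  # the single byte in question

# Read as cp850, the byte is the letter e with a grave accent.
as_cp850 = raw.decode("cp850")
print(repr(as_cp850))

# Read as latin-1 (an assumed mechanism), the byte maps one-to-one
# to the code point U+008A, which is a control character.
misread = raw.decode("latin-1")
print(repr(misread), unicodedata.category(misread))

# cp850 has no slot for U+008A, so encoding it fails, as in the traceback.
try:
    misread.encode("cp850")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True
print(encode_failed)
```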

I suspect this has something to do with defining the rootdir as a unicode string: rootdir = u"D:\\ghostone"
Perhaps if you removed the 'u', your program would work.
Or perhaps you should explicitly decode the values in os.listdir(dir) before joining them to the rootdir and re-encoding.
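
A rough sketch of the explicit-decoding idea, again in Python 3 syntax. The helpers decode_names and console_safe are hypothetical, the sample names are invented, and cp850 is assumed only because it appears in the traceback; the right encoding for the file names on your system may be different (sys.getfilesystemencoding() reports what Python assumes).

```python
def decode_names(raw_names, encoding):
    """Return text names, decoding any byte strings with the given encoding."""
    decoded = []
    for name in raw_names:
        if isinstance(name, bytes):
            name = name.decode(encoding)
        decoded.append(name)
    return decoded


def console_safe(text, encoding):
    """Encode for printing, substituting '?' for unencodable characters."""
    return text.encode(encoding, errors="replace")


# Invented sample: byte names as a cp850 system might hand them over.
names = decode_names([b"voc\x88.txt", b"plain.txt"], "cp850")
print(names)

# A stray control character no longer aborts the program when encoded.
print(console_safe(u"bad\x8aname", "cp850"))
```

Decoding once at the boundary and keeping text everywhere else means the join with the unicode rootdir never mixes the two kinds of string.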

This sort of thing sometimes works better with Python 3, where every str is unicode and os.listdir returns text names when given a text path.

Does someone know how to fix this?

Thank you!

João Victor Sousa Jácome



--
Terry Jan Reedy