I created the following filename in Windows just as a test:

    “Dönåld’s™ Néphêws” deg°.txt

The quotes are non-ASCII, and there are many non-English characters, a long hyphen, etc.
Now in DOS I can do a directory listing and it translates them all to something close:

    "Dönåld'sT Néphêws" deg°.txt

I thought the correct way to do this in Python would be to scan the directory:

    import os, sys
    files = os.listdir(os.path.dirname(os.path.realpath(__file__)))

then print the filenames:

    for filename in files:
        print filename

but as expected the filename is not correct, so I tried to correct it using the file system's encoding:

    for filename in files:
        print filename.decode(sys.getfilesystemencoding())

But I get:

    UnicodeEncodeError: 'charmap' codec can't encode character u'\u2014' in position 6: character maps to <undefined>

All was working well until these characters came along.

I need to be able to write (a representation of) the filename to the screen (and I don't see why I should not get something as good as DOS shows), write it to an XML file in UTF-8, and write it to a text file and be able to read it back in. Again I was surprised that this was also difficult: it appears that the file also wanted ASCII. Should I have to open the file in binary for writing (I expect so), and if so, what encoding should I write in?

I have been beating myself up with this for weeks, as I get it working and then come across some other character that causes it all to stop again. Please help.
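For what it is worth, here is a minimal sketch of one way the three goals might be approached. It is not a definitive fix. It assumes the error comes from print encoding the decoded name for the console code page, not from the decode step itself. The helper names (list_names, safe_for_console, save_names, load_names) are made up for illustration; only os.listdir, io.open and sys.getfilesystemencoding come from the standard library.

```python
import io
import os
import sys


def list_names(directory):
    """List directory entries as text (unicode) rather than raw bytes."""
    # Hypothetical helper: a byte-string path is decoded first, because
    # os.listdir returns unicode names when it is given a unicode path.
    if isinstance(directory, bytes):
        directory = directory.decode(sys.getfilesystemencoding())
    return os.listdir(directory)


def safe_for_console(text, encoding=None):
    """Replace characters the console encoding cannot represent with '?'."""
    # Assumption: fall back to ASCII when stdout reports no encoding.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, "replace").decode(encoding)


def save_names(names, path):
    """Write one name per line in text mode with an explicit UTF-8 encoding."""
    with io.open(path, "w", encoding="utf-8") as handle:
        for name in names:
            handle.write(name + u"\n")


def load_names(path):
    """Read the names back as unicode strings."""
    with io.open(path, "r", encoding="utf-8") as handle:
        return [line.rstrip(u"\n") for line in handle]
```

Printing safe_for_console(name) for each name should avoid the UnicodeEncodeError, though unmappable characters come out as "?" rather than the closer substitutes (such as "T" for the trademark sign) that the DOS listing shows. Opening the file in text mode with encoding="utf-8" means binary mode is not needed, and the same approach should work for the XML file as long as its declaration also says UTF-8.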