Re: Printing Filenames with non-Ascii-Characters

aurora Tue, 01 Feb 2005 12:00:05 -0800

On Tue, 01 Feb 2005 20:28:11 +0100, Marian Aldenhövel <[EMAIL PROTECTED]> wrote:

Hi,
I am very new to Python and have run into the following problem. If I do
something like
   dir = os.listdir(somepath)
   for d in dir:
      print d
                
The program fails for filenames that contain non-ascii characters.
   'ascii' codec can't encode characters in position 33-34:
I have noticed that this seems to be a very common problem. I have read a lot of postings regarding it but not really found a solution. Is there a simple one?


The English Windows command prompt uses the cp437 character set. To print the name, encode it first:

  print d.encode('cp437')

The issue is that a terminal only understands a certain character set. If you have a unicode string, like d in your case, you have to encode it before it can be printed. (We really need a native unicode terminal!) If you don't encode it, Python will do it for you using the default encoding, which is ASCII, so any string that contains a non-ASCII character will give you trouble. In my opinion Python is too conservative in using 'strict' error handling here, which causes users unaware of unicode a lot of woes.
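
As a rough sketch of the difference between the implicit ASCII conversion and an explicit encode: the filename and the helper encode_for_terminal below are made up for illustration, and the code is written so that it also runs on newer interpreters, where every str is already unicode.

```python
# Hypothetical filename containing one non-ASCII character (o-umlaut).
name = u'Aldenh\xf6vel.txt'

def encode_for_terminal(text, encoding='cp437', errors='strict'):
    """Encode a unicode string into the bytes a terminal using `encoding` expects."""
    return text.encode(encoding, errors)

# The implicit conversion behaves like a strict ASCII encode, which fails
# as soon as a non-ASCII character shows up.
try:
    encode_for_terminal(name, 'ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# cp437 has a code for the o-umlaut, so the explicit encode succeeds.
cp437_bytes = encode_for_terminal(name)

# A lenient error handler substitutes '?' instead of raising.
ascii_lossy = encode_for_terminal(name, 'ascii', 'replace')
```

The 'replace' handler loses information, so it is only suitable for display, not for passing the name back to the file system.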

So how did you get a unicode d to start with? If 'somepath' is unicode, os.listdir returns a list of unicode strings. So why is somepath unicode? Either you entered a unicode literal or it comes from some other source. One possible source is an XML parser, which returns strings as unicode.
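
A small way to check this yourself, assuming a reasonably recent interpreter (the helper name is invented for this sketch): the type of the names os.listdir returns follows the type of the path you pass in.

```python
import os

def listing_matches_argument_type(path):
    """Return True if every name from os.listdir(path) has the same type as path."""
    return all(type(name) is type(path) for name in os.listdir(path))

# A unicode path gives unicode names, a byte-string path gives byte strings.
unicode_result = listing_matches_argument_type(u'.')
bytes_result = listing_matches_argument_type('.'.encode('ascii'))
```

On Python 2, names that cannot be decoded may still come back as byte strings even for a unicode path, so unicode_result is not guaranteed to be True there.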

Windows NT supports unicode filenames. I'm not sure about Linux; the result may differ slightly.
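
Since the terminal encoding differs between platforms, one option is to ask sys.stdout for it rather than hard-coding cp437. This is only a sketch: terminal_encoding and printable are hypothetical helpers, and the ASCII fallback is my assumption for the case where stdout has no encoding (for example when output is piped).

```python
import sys

def terminal_encoding(default='ascii'):
    """Best guess at the encoding the terminal expects, falling back to `default`."""
    return getattr(sys.stdout, 'encoding', None) or default

def printable(text):
    """Return `text` with characters the terminal cannot show replaced by '?'."""
    enc = terminal_encoding()
    # Round-trip through the terminal encoding so the result is safe to print.
    return text.encode(enc, 'replace').decode(enc)

safe_name = printable(u'Aldenh\xf6vel.txt')  # hypothetical filename
```

With this, the loop from the question could print printable(d) for each entry and would not raise on either platform.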

What I specifically do not understand is why Python wants to interpret the string as ASCII at all. Where is this setting hidden?
I am running Python 2.3.4 on Windows XP and I want to run the program on
Debian sarge later.
Ciao, MM


