On Wed, 11 Feb 2009 23:11:37 -0200, jeffg <jeffgem...@gmail.com> wrote:
> On Feb 11, 6:30 pm, "Martin v. Löwis" <ma...@v.loewis.de> wrote:
>
> > > Thanks, I ended up using encode('iso-8859-15', "replace")
> > > Perhaps more up to date than cp1252...??
> > If you encode as iso-8859-15, but this is not what your terminal
> > expects, it certainly won't print correctly. To get correct printing,
> > the output encoding must be the same as the terminal encoding. If the
> > terminal encoding is not up to date (as you consider cp1252), then
> > the output encoding should not be up to date, either.
> I did try UTF-8 but it produced the upper case character instead of
> the proper lower case, so the output was incorrect for the Unicode
> supplied.
> I think both 8859-15 and cp1252 produced the correct output, but I
> figured 8859-15 would have additional character support (though not
> sure this is the case - if it is not, please let me know and I'll use
> 1252).  I'm dealing with large data sets and this just happened to be
> one small example.  I want to have the best ability to write future
> Unicode characters properly based on running from the Windows command
> line (unless there is a better way to do it on Windows).

As Martin v. Löwis already said, the encoding Python uses when writing to the console must match the encoding the console expects. (You should also use a font capable of displaying such characters.)
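A minimal sketch of that rule, written for Python 3 (the session further down uses Python 2.6). The helper name encode_for_console is made up for illustration; the idea is to ask the output stream for its encoding instead of hard-coding one, and to fall back to replacement characters for anything the console cannot represent:

```python
import sys


def encode_for_console(text, encoding=None, errors="replace"):
    """Encode text with the console's own encoding (hypothetical helper).

    If no encoding is given, use whatever sys.stdout reports. That value
    can be None when output is redirected, hence the ASCII fallback.
    """
    target = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(target, errors)


if __name__ == "__main__":
    oe = "\u0153"  # LATIN SMALL LIGATURE OE
    # Printing the bytes objects keeps the demo itself ASCII-only.
    print(encode_for_console(oe, "cp1252"))
    print(encode_for_console(oe, "cp437"))
```

With errors="replace", a character missing from the target code page becomes "?" rather than raising UnicodeEncodeError, which is the same trade-off as the encode('iso-8859-15', "replace") call quoted above.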

windows-1252 and iso-8859-15 are similar, but not identical. This table shows the differences (fewer than 30 printable characters): http://en.wikipedia.org/wiki/Western_Latin_character_sets_(computing)

If your script encodes its output using iso-8859-15, the corresponding console code page should be 28605. "Western European" (whatever that means exactly) Windows versions use windows-1252 as the "ANSI code page" (for GUI applications) and cp850 as the "OEM code page" (for console applications); US versions use cp437 as the OEM code page instead.
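If you would rather check the differences yourself than read the table, something along these lines should work. It is a Python 3 sketch, and the helper name charset is made up for illustration. It decodes every byte value with each codec and compares the resulting sets of characters:

```python
def charset(encoding):
    """Return the set of characters a single-byte encoding can represent."""
    chars = set()
    for byte in range(256):
        try:
            chars.add(bytes([byte]).decode(encoding))
        except UnicodeDecodeError:
            # Some byte values are unassigned (windows-1252 has a few).
            pass
    return chars


cp1252 = charset("windows-1252")
latin9 = charset("iso-8859-15")

# Keep only printable characters; this drops the C1 control codes that
# iso-8859-15 assigns to bytes 0x80-0x9F.
only_cp1252 = sorted(c for c in cp1252 - latin9 if c.isprintable())
only_latin9 = sorted(c for c in latin9 - cp1252 if c.isprintable())

# Report code points rather than the characters themselves, so the output
# does not depend on what the console can display.
print("only in windows-1252:", ["U+%04X" % ord(c) for c in only_cp1252])
print("only in iso-8859-15: ", ["U+%04X" % ord(c) for c in only_latin9])
```

Every printable character of iso-8859-15 also exists in windows-1252, so the second list should come out empty: iso-8859-15 does not give you any additional characters. What matters is only which one the console is set to.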

C:\Documents and Settings\Gabriel>chcp 1252
Tabla de códigos activa: 1252

C:\Documents and Settings\Gabriel>python
Python 2.6 (r26:66721, Oct 2 2008, 11:35:03) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
py> unichr(0x0153).encode("windows-1252")
'\x9c'
py> print _
œ
py> ^Z

C:\Documents and Settings\Gabriel>chcp 28605
Tabla de códigos activa: 28605

C:\Documents and Settings\Gabriel>python
Python 2.6 (r26:66721, Oct 2 2008, 11:35:03) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
py> unichr(0x0153).encode("iso-8859-15")
'\xbd'
py> print _
œ
py> unichr(0x0153).encode("latin9")
'\xbd'
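
The two sessions above work because the encoding and the code page agree. What happens when they do not can be simulated without touching chcp. This is a Python 3 sketch, and the function name seen_on_console is made up for illustration: it encodes with one encoding and decodes the bytes with another, which is roughly what a console does when it interprets your output:

```python
def seen_on_console(text, output_encoding, console_encoding):
    """Simulate a console: encode with one encoding, interpret with another."""
    raw = text.encode(output_encoding)
    return raw.decode(console_encoding, errors="replace")


oe = "\u0153"  # LATIN SMALL LIGATURE OE

# Matching encodings round-trip cleanly.
assert seen_on_console(oe, "windows-1252", "windows-1252") == oe
assert seen_on_console(oe, "iso-8859-15", "iso-8859-15") == oe

# iso-8859-15 output on a cp1252 console: byte 0xBD shows up as
# U+00BD (VULGAR FRACTION ONE HALF) instead of the ligature.
assert seen_on_console(oe, "iso-8859-15", "windows-1252") == "\u00bd"

# UTF-8 output on a cp1252 console: two bytes (0xC5 0x93) become two
# unrelated characters, U+00C5 and U+201C.
assert seen_on_console(oe, "utf-8", "windows-1252") == "\u00c5\u201c"
```

The same helper can be used to try other pairs, for example cp850 or cp437 as the console encoding.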

--
Gabriel Genellina

