Re: Python unicode and Windows cmd.exe

2010-03-14 Thread Alf P. Steinbach
* Mark Tolonen: "Terry Reedy" wrote in message news:hnjkuo$n1...@dough.gmane.org... On 3/14/2010 4:40 PM, Guillermo wrote: Adding the byte that some call a 'utf-8 bom' makes the file an invalid utf-8 file. Not true. From http://unicode.org/faq/utf_bom.html: Q: When a BOM is used, is it o

Re: Python unicode and Windows cmd.exe

2010-03-14 Thread Mark Tolonen
"Terry Reedy" wrote in message news:hnjkuo$n1...@dough.gmane.org... On 3/14/2010 4:40 PM, Guillermo wrote: Adding the byte that some call a 'utf-8 bom' makes the file an invalid utf-8 file. Not true. From http://unicode.org/faq/utf_bom.html: Q: When a BOM is used, is it only in 16-bit Uni

Re: Python unicode and Windows cmd.exe

2010-03-14 Thread Guillermo
> 2) My script gets output from a Popen call (to execute a Powershell > script [new Windows shell language] from Python; it does make sense!). > I suppose changing the Windows codepage for a single Popen call isn't > straightforward/possible? Nevermind. I'm able to change Windows' codepage to 6500

Re: Python unicode and Windows cmd.exe

2010-03-14 Thread Neil Hodgson
Guillermo: > 2) My script gets output from a Popen call (to execute a Powershell > script [new Windows shell language] from Python; it does make sense!). > I suppose changing the Windows codepage for a single Popen call isn't > straightforward/possible? You could try SetConsoleOutputCP and Set
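
[Editor's sketch, not part of the original thread.] One way to sidestep the console code page for a single Popen-style call is to capture the child's output as raw bytes and decode it explicitly. The example below assumes Python 3; the helper name run_and_decode and the child command (a stand-in that emits cp850 bytes, as a console reporting "chcp 850" might) are hypothetical and only for illustration.

```python
import subprocess
import sys


def run_and_decode(cmd, encoding):
    """Run cmd, capture stdout as raw bytes, and decode with an explicit encoding."""
    raw = subprocess.run(cmd, stdout=subprocess.PIPE, check=True).stdout
    return raw.decode(encoding)


# Hypothetical stand-in for the real child process: it writes "mañana"
# encoded as cp850 bytes straight to stdout, bypassing any text layer.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write('ma\\xf1ana'.encode('cp850'))",
]

# Decoding with the code page the child actually used recovers the text.
decoded = run_and_decode(child, "cp850")
print(decoded)
```

Decoding the same bytes as UTF-8 would fail, because the cp850 byte for "ñ" is not a valid UTF-8 sequence on its own; the encoding passed to decode has to match whatever the child really wrote.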

Re: Python unicode and Windows cmd.exe

2010-03-14 Thread Guillermo
>    The console is commonly using Code Page 437 which is most compatible > with old DOS programs since it can display line drawing characters. You > can change the code page to UTF-8 with > chcp 65001 That's another issue in my actual script. A twofold problem, actually: 1) For me chcp gives 850

Re: Python unicode and Windows cmd.exe

2010-03-14 Thread Neil Hodgson
Guillermo: > Is this an enforced convention under Windows, then? My head's aching > after so much pulling at my hair, but I have the feeling that the > problem only arises when text travels through the dos console... The console is commonly using Code Page 437 which is most compatible with old

Re: Python unicode and Windows cmd.exe

2010-03-14 Thread Terry Reedy
On 3/14/2010 4:40 PM, Guillermo wrote: Hi, I would appreciate if someone could point out what am I doing wrong here. Basically, I need to save a string containing non-ascii characters to a file encoded in utf-8. If I stay in python, everything seems to work fine, but the moment I try to read t

Re: Python unicode and Windows cmd.exe

2010-03-14 Thread Joaquin Abian
On 14 mar, 22:22, Guillermo wrote: > > That is what happens: the file now starts with a BOM \xEF\xBB\xBF as > > you can see with a hex editor. > > Is this an enforced convention under Windows, then? My head's aching > after so much pulling at my hair, but I have the feeling that the > problem o

Re: Python unicode and Windows cmd.exe

2010-03-14 Thread Guillermo
> That is what happens: the file now starts with a BOM \xEF\xBB\xBF as > you can see with a hex editor. Is this an enforced convention under Windows, then? My head's aching after so much pulling at my hair, but I have the feeling that the problem only arises when text travels through the dos co

Re: Python unicode and Windows cmd.exe

2010-03-14 Thread Neil Hodgson
Guillermo: > I then open the file m.txt with notepad, and I see "mañana" normally. > I save (again, no actual modifications), go back to the dos prompt, do > type m.txt and this time it works! I get "mañana". When notepad opens > the file, the encoding is already UTF-8, so short of a UTF-8 bom bei
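
[Editor's sketch, not part of the original thread.] The Notepad behaviour described above can be checked from Python by looking at the first three bytes of the file. This assumes Python 3 and its standard "utf-8" and "utf-8-sig" codecs; the file names and the helper starts_with_utf8_bom are hypothetical.

```python
import codecs
import os
import tempfile

text = "ma\u00f1ana"  # "mañana"


def starts_with_utf8_bom(path):
    """Return True if the file begins with the UTF-8 signature EF BB BF."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


tmpdir = tempfile.mkdtemp()
plain_path = os.path.join(tmpdir, "m_plain.txt")  # hypothetical file names
bom_path = os.path.join(tmpdir, "m_bom.txt")

# "utf-8" writes no signature; "utf-8-sig" prepends the three BOM bytes.
with open(plain_path, "w", encoding="utf-8") as f:
    f.write(text)
with open(bom_path, "w", encoding="utf-8-sig") as f:
    f.write(text)

has_bom_plain = starts_with_utf8_bom(plain_path)
has_bom_sig = starts_with_utf8_bom(bom_path)

# Reading with "utf-8-sig" strips the signature if present and is
# harmless if it is absent, so it handles both files.
with open(bom_path, encoding="utf-8-sig") as f:
    read_back_bom = f.read()
with open(plain_path, encoding="utf-8-sig") as f:
    read_back_plain = f.read()
```

Reading the BOM-prefixed file with plain "utf-8" instead would leave a leading U+FEFF character in the string, which is a common source of confusion when a file has been round-tripped through an editor that adds the signature.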