Mark Dickinson <[EMAIL PROTECTED]> added the comment:

It looks like your conjectures are right in both cases.

I tried adding a few lines to Modules/python.c to print out the argv 
entries as byte strings, before they're passed to mbstowcs.  Results
on OS X 10.5:

> 1. Somebody runs "a.py ภาษาไทย" in a Terminal.app window. Most likely,
> the terminal encoding is applied, which we should assume to be UTF-8
> (although it might be different on some systems).

Yes, it appears that the terminal encoding is applied, if I'm reading 
the results right.  Trying

./python.exe a.py é

with the terminal character encoding set to "Unicode (UTF-8)", Python 
receives the third argv entry (the "é") as bytes([195, 169]).  With the 
terminal encoding set to "Western (ISO Latin 1)" instead, Python receives
bytes([233]).
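
For reference, those two byte sequences are just "é" encoded as UTF-8 and as 
Latin-1 respectively.  A minimal sketch that reproduces them from within 
Python (the variable names are mine, not from the patch to python.c):

```python
# "é" is U+00E9; encode it with the two terminal settings tried above.
ch = "\u00e9"

utf8_bytes = ch.encode("utf-8")      # terminal set to "Unicode (UTF-8)"
latin1_bytes = ch.encode("latin-1")  # terminal set to "Western (ISO Latin 1)"

print(list(utf8_bytes))    # [195, 169]
print(list(latin1_bytes))  # [233]
```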

> 2. Somebody creates a file japanese_コンテンツ in the finder, then uses
> shell completion to pass this to a Python script. Here I expect that
> UTF-8 is used even if the terminal's encoding is not UTF-8.

Yes.  Python seems to receive the same string regardless of terminal 
encoding.  (With the terminal encoding set to latin1, the tab-completed 
filename looks like garbage within Terminal, of course.)
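
The garbage is presumably what you get when the UTF-8 bytes of the filename 
are displayed as Latin-1.  A small sketch of that effect, assuming the bytes 
Python receives are plain UTF-8 (I haven't checked the exact byte values for 
this filename, so treat this as an illustration only):

```python
# Assumption: the tab-completed filename arrives as UTF-8 bytes.
name = "japanese_コンテンツ"
raw = name.encode("utf-8")

# What a Latin-1 terminal would show for those same bytes: one character
# per byte, so each katakana turns into three unrelated characters.
shown_in_latin1_terminal = raw.decode("latin-1")

# Decoding the bytes as UTF-8 gives back the original name, regardless of
# what the terminal displayed.
recovered = raw.decode("utf-8")

print(repr(shown_in_latin1_terminal))
print(recovered == name)
```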

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue4388>
_______________________________________