Mark Dickinson <[EMAIL PROTECTED]> added the comment:

It looks like your conjectures are right in both cases.
I tried adding a few lines to Modules/python.c to print out the argv
entries as byte strings, before they're passed to mbstowcs. Results on
OS X 10.5:

> 1. Somebody runs "a.py ภาษาไทย" in a Terminal.app window. Most likely,
> the terminal encoding is applied, which we should assume to be UTF-8
> (although it might be different on some systems).

Yes, it appears that the terminal encoding is applied, if I'm reading
the results right. Trying

  ./python.exe a.py é

with the terminal character encoding set to "Unicode (UTF-8)", Python
receives the third argument as bytes([195, 169]). With the terminal
encoding set to "Western (ISO Latin 1)" instead, Python receives
bytes([233]).

> 2. Somebody creates a file japanese_コンテンツ in the finder, then uses
> shell completion to pass this to a Python script. Here I expect that
> UTF-8 is used even if the terminal's encoding is not UTF-8.

Yes. Python seems to receive the same string regardless of terminal
encoding. (With the terminal encoding set to latin1, the tab-completed
filename looks like garbage within Terminal, of course.)

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue4388>
_______________________________________
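
For reference, the byte values reported above are what the two codecs produce for "é", which is consistent with the terminal encoding being applied to typed arguments. The snippet below is only an illustrative sketch in plain Python, not the patch to Modules/python.c. The variable names are made up for the example, and the last step assumes that Terminal simply renders the UTF-8 bytes of the file name as Latin-1.

```python
# Illustrative sketch only: reproduce the byte values seen for "é".
ch = "\u00e9"  # é, LATIN SMALL LETTER E WITH ACUTE

utf8_bytes = ch.encode("utf-8")      # terminal set to "Unicode (UTF-8)"
latin1_bytes = ch.encode("latin-1")  # terminal set to "Western (ISO Latin 1)"

print(list(utf8_bytes))    # [195, 169]
print(list(latin1_bytes))  # [233]

# Assumption: a Latin-1 terminal displays the UTF-8 bytes of the
# tab-completed name one byte per character, hence the garbage.
name = "japanese_コンテンツ"
garbled = name.encode("utf-8").decode("latin-1")

# The bytes themselves are unchanged, so the name is still recoverable.
recovered = garbled.encode("latin-1").decode("utf-8")
print(recovered == name)   # True
```

The round trip at the end shows why Python can receive the same bytes in case 2 even though the name is unreadable on screen: only the display is affected, not the argument.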