STINNER Victor added the comment:

>> Or said differently, the filesystem encoding is different than the
>> locale encoding.

> Indeed, but the FS encoding and the IO encoding are the same.
> "locale encoding" doesn't really matter here, as we are assuming that
> it's wrong.

Oh, I realize that the term "FS encoding" is not clear. When I wrote "FS 
encoding", I meant sys.getfilesystemencoding(), which is mbcs on Windows, UTF-8 
on Mac OS X and (currently) the locale encoding on other platforms (UNIX, e.g. 
Linux/FreeBSD/Solaris/AIX).
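
To make the terminology concrete, here is a small sketch that prints the 
encodings a running interpreter reports. The helper name describe_encodings 
is hypothetical, and the values depend on the platform and on the LANG/LC_* 
environment variables, so no particular output should be expected.

```python
import locale
import sys


def describe_encodings():
    """Return the encodings that Python reports for the current process."""
    return {
        # What this message calls the "FS encoding".
        "filesystem": sys.getfilesystemencoding(),
        # The locale encoding, without calling setlocale() as a side effect.
        "locale": locale.getpreferredencoding(False),
        # The IO encoding of stdout (None if stdout has no such attribute).
        "stdout": getattr(sys.stdout, "encoding", None),
    }


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(f"{name:>10}: {value}")
```

Running it once with LANG=C and once with a UTF-8 locale is one way to see 
whether the three values agree on a given system.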

--

IMO there are two different points in this issue:

(a) which encoding should be used when the C locale is used: the encoding 
announced by the OS through nl_langinfo(CODESET) (current choice), or an 
arbitrary, optimistic "utf-8" encoding?

(b) for technical reasons, Python reuses the C codec during Python 
initialization to decode and encode OS data, and so currently Python *must* use 
the locale encoding for its "filesystem encoding"
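
For point (a), the value announced by the OS can be inspected from Python on 
UNIX. This is only a sketch: the helper name codeset_for is hypothetical, 
locale.nl_langinfo() is not available on Windows, and the string returned for 
the C locale varies between C libraries (glibc typically reports an ASCII 
alias).

```python
import locale


def codeset_for(locale_name):
    """Return nl_langinfo(CODESET) with LC_CTYPE temporarily set to locale_name."""
    # Remember the current LC_CTYPE setting so it can be restored afterwards.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, locale_name)
        return locale.nl_langinfo(locale.CODESET)
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    # "C" is the locale discussed in this issue; "" means "use the environment".
    print("C locale codeset:   ", codeset_for("C"))
    print("user locale codeset:", codeset_for(""))
```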

Before I can give an opinion on point (a), I would like to see a patch 
fixing point (b). I'm not against fixing point (b). I'm just saying that 
it's not trivial, and it obviously must be fixed before the status of point 
(a) can change. I even gave hints on how to fix point (b).

--

asciilocale.patch has many issues. Try running the Python test suite with this 
patch to see what I mean. Examples of failures:

======================================================================
FAIL: test_non_ascii (test.test_cmd_line.CmdLineTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/haypo/prog/python/default/Lib/test/test_cmd_line.py", line 140, in test_non_ascii
    assert_python_ok('-c', command)
  File "/home/haypo/prog/python/default/Lib/test/script_helper.py", line 69, in assert_python_ok
    return _assert_python(True, *args, **env_vars)
  File "/home/haypo/prog/python/default/Lib/test/script_helper.py", line 55, in _assert_python
    "stderr follows:\n%s" % (rc, err.decode('ascii', 'ignore')))
AssertionError: Process return code is 1, stderr follows:
Unable to decode the command from the command line:
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 12: surrogates not allowed

======================================================================
FAIL: test_ioencoding_nonascii (test.test_sys.SysModuleTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/haypo/prog/python/default/Lib/test/test_sys.py", line 603, in test_ioencoding_nonascii
    self.assertEqual(out, os.fsencode(test.support.FS_NONASCII))
AssertionError: b'' != b'\xc3\xa6'

======================================================================
FAIL: test_nonascii (test.test_warnings.CEnvironmentVariableTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/haypo/prog/python/default/Lib/test/test_warnings.py", line 774, in test_nonascii
    "['ignore:Deprecaci\xf3nWarning']".encode('utf-8'))
AssertionError: b"['ignore:Deprecaci\\udcc3\\udcb3nWarning']" != b"['ignore:Deprecaci\xc3\xb3nWarning']"

======================================================================
FAIL: test_nonascii (test.test_warnings.PyEnvironmentVariableTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/haypo/prog/python/default/Lib/test/test_warnings.py", line 774, in test_nonascii
    "['ignore:Deprecaci\xf3nWarning']".encode('utf-8'))
AssertionError: b"['ignore:Deprecaci\\udcc3\\udcb3nWarning']" != b"['ignore:Deprecaci\xc3\xb3nWarning']"


The test_warnings failures are probably #9988; the test_cmd_line failure may 
be #9992.

There may be other issues: the Python test suite has only a few tests for 
non-ASCII characters.
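
The '\udcc3' characters in these failures are consistent with bytes being 
decoded with one codec and encoded with another. The following sketch does 
not reproduce the patch or the tests themselves; it only shows, under the 
assumption that UTF-8 bytes were decoded as ASCII with the surrogateescape 
error handler, how lone surrogates like the ones above can appear. The 
helper name decode_as_ascii is hypothetical.

```python
def decode_as_ascii(raw):
    """Decode bytes as ASCII, mapping each undecodable byte to a lone surrogate."""
    return raw.decode("ascii", "surrogateescape")


# UTF-8 bytes of the string used by test_warnings ('ó' is b'\xc3\xb3').
raw = "Deprecaci\xf3nWarning".encode("utf-8")

# Each non-ASCII byte 0xNN becomes the lone surrogate U+DCNN.
text = decode_as_ascii(raw)
print(ascii(text))

# Encoding back with the *same* codec and handler restores the original bytes.
print(text.encode("ascii", "surrogateescape") == raw)

# Encoding with a *different*, strict codec fails on the surrogates.
try:
    text.encode("utf-8")
except UnicodeEncodeError as exc:
    print("strict utf-8 failed:", exc.reason)

# With backslashreplace the surrogates are written out as literal escapes.
print(text.encode("utf-8", "backslashreplace"))
```

The round trip only works when decoding and encoding use the same codec, 
which is the practical reason the two encodings have to stay in sync.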

--

If anything is changed, I would prefer to have more than a few months of 
testing to make sure that it doesn't break anything. So I set the version 
field to Python 3.5.

----------
versions: +Python 3.5 -Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19846>
_______________________________________