New submission from STINNER Victor <victor.stin...@haypocalc.com>:

Python (2 and 3) is unable to load a module installed in a directory whose 
name contains characters that cannot be encoded to the locale encoding. Python 
also doesn't work if it is installed in a non-ASCII directory on Windows, or 
with a locale encoding other than UTF-8. On Windows, the locale encoding is 
"mbcs", which is a small charset, unable to mix different languages, whereas 
the file system is fully Unicode compatible (it uses UTF-16). Python should 
work with Unicode strings (wchar_t*, Py_UNICODE* or PyUnicodeObject) instead of 
byte strings (char* or PyBytesObject), especially while loading a Python module.
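
To illustrate the encoding problem described above, here is a minimal sketch 
in Python 3. It is not taken from the patch: the helper name can_encode() and 
the sample path are made up for this example, and cp1252 is only used as a 
stand-in for a small single-language code page such as the ones "mbcs" maps to.

```python
# Hypothetical path mixing Western European and Japanese characters.
PATH = "/opt/caf\u00e9/\u65e5\u672c\u8a9e/module.py"


def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for name in ("ascii", "cp1252", "utf-8"):
    print(name, can_encode(PATH, name))
```

A byte string in such an encoding cannot represent the path at all, which is 
why an import system built on char* fails, while a Unicode string holds it 
without loss.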

It's not an easy task because it requires changing a lot of code, especially 
in Python/import.c. I have been working on this topic for some months and I now 
have a working patch. It's now possible to run Python from a source tree 
whose path contains a non-ASCII character in the C locale (ASCII encoding). 
Except for a minor bug in test_gdb, all tests of the test suite pass.
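
The scenario can be reproduced with a short script. This is only a sketch of 
the kind of check involved, not part of the patch: the function name, the 
directory name and the module name "issue9425_demo" are invented here, and it 
assumes a Python 3 build whose file system encoding can represent the 
directory name.

```python
import importlib
import os
import sys
import tempfile


def import_from_non_ascii_dir(dirname="caf\u00e9_\u65e5\u672c"):
    """Create a module inside a non-ASCII directory and try to import it."""
    with tempfile.TemporaryDirectory() as tmp:
        mod_dir = os.path.join(tmp, dirname)
        os.mkdir(mod_dir)
        # Hypothetical module name, chosen only for this demonstration.
        mod_name = "issue9425_demo"
        mod_path = os.path.join(mod_dir, mod_name + ".py")
        with open(mod_path, "w", encoding="utf-8") as f:
            f.write("VALUE = 42\n")
        sys.path.insert(0, mod_dir)
        try:
            # Make sure the path finders notice the freshly created file.
            importlib.invalidate_caches()
            module = importlib.import_module(mod_name)
            return module.VALUE, module.__file__
        finally:
            # Leave sys.path and sys.modules as we found them.
            sys.path.remove(mod_dir)
            sys.modules.pop(mod_name, None)


if __name__ == "__main__":
    print(import_from_non_ascii_dir())
```

On an interpreter affected by this issue, the import step is where the 
failure would be expected to show up.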

I posted the whole patch on Rietveld for a review:
http://codereview.appspot.com/1874048

The patch is huge because it fixes different things:

 a) import machinery (import.c, getpath.c, importdl.c, ...)
 b) many error handlers using filenames (compile.c, errors.c, _warnings.c, 
sysmodule.c, ...)
 c) functions using filenames, especially the full path to Python: code that 
logs a filename (e.g. Lib/distutils/file_util.py) or writes a filename to 
program output (e.g. Lib/platform.py)
 d) tests (Lib/test/test_*.py)

(b), (c) and (d) can be fixed before/without (a). But (a) requires the other 
parts in order to work correctly.
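
As an illustration of the kind of problem covered by (b) and (c), here is one 
possible way to write a filename to an output stream whose encoding cannot 
represent it. This is an assumption made for the example, not necessarily what 
the patch does, and printable_filename() is a hypothetical helper.

```python
def printable_filename(filename, encoding="ascii"):
    """Return filename with unencodable characters replaced by escapes."""
    # "backslashreplace" substitutes \xNN or \uNNNN escapes instead of raising
    # UnicodeEncodeError, so the message can always be written out.
    return filename.encode(encoding, "backslashreplace").decode(encoding)


print(printable_filename("/tmp/caf\u00e9/mod.py"))
```

The escaped form is lossy for display purposes only; the original Unicode 
string is still the one to use for opening the file.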

If it's not possible to review the patch as a whole, I can try to split it 
into smaller parts.

--

Related issues:

 #3080: Full unicode import system
 #4352: imp.find_module() fails with a UnicodeDecodeError 
        when called with non-ASCII search paths
 #8611: Python3 doesn't support locale different than utf8 
        and an non-ASCII path (POSIX)
 #8988: import + coding = failure (3.1.2/win32)

--

See also my email sent to python-dev for more information:
http://mail.python.org/pipermail/python-dev/2010-July/101619.html

----------
components: Interpreter Core, Unicode
messages: 112026
nosy: haypo
priority: normal
severity: normal
status: open
title: Rewrite import machinery to work with unicode paths
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9425>
_______________________________________