STINNER Victor <victor.stin...@haypocalc.com> added the comment:

Oh oh. The situation is not as simple as expected. 3 functions only accept 
Unicode strings and 3 other functions decode "manually" byte strings from the 
ANSI code page.

--

chdir(), rmdir(), unlink(), access(), chmod(), link(), listdir(), 
_getfullpath(), mkdir(), utime(), open(), startfile(), stat() and lstat() 
use the ANSI or the wide character API depending on the type of the input 
arguments.
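
To make the dispatch rule concrete, here is a minimal Python-level sketch. It 
is not the C code from posixmodule.c; pick_api() is a hypothetical helper and 
the "ansi"/"wide" labels are only illustrative names for the two API families.

```python
def pick_api(path):
    """Return which Windows API family would be chosen for this argument.

    Hypothetical illustration only: str arguments go to the wide character
    API, byte strings go to the ANSI API.
    """
    if isinstance(path, str):
        return "wide"   # wide character (...W) variant
    if isinstance(path, (bytes, bytearray)):
        return "ansi"   # ANSI code page (...A) variant
    raise TypeError("expected str or bytes, got %s" % type(path).__name__)


if __name__ == "__main__":
    print(pick_api("spam.txt"))
    print(pick_api(b"spam.txt"))
```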

rename(), symlink() and putenv() only use the wide character API. They use 
convert_to_unicode() to convert input arguments to Unicode. Byte strings are 
decoded from the file system encoding using the strict error handler.
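
A rough Python equivalent of that conversion, as a sketch only: to_unicode() 
is a hypothetical stand-in for the C helper convert_to_unicode(), and the 
optional encoding parameter is an assumption added so the behaviour can be 
shown with a fixed encoding instead of the platform's file system encoding.

```python
import sys


def to_unicode(arg, encoding=None):
    """Return arg as str, decoding byte strings with the strict handler.

    Hypothetical stand-in for convert_to_unicode(); by default it uses the
    file system encoding reported by sys.getfilesystemencoding().
    """
    if isinstance(arg, str):
        return arg
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    # "strict" means undecodable bytes raise UnicodeDecodeError
    return bytes(arg).decode(encoding, "strict")


if __name__ == "__main__":
    print(to_unicode(b"name", "ascii"))
    try:
        to_unicode(b"\xff", "ascii")
    except UnicodeDecodeError as exc:
        print("undecodable:", exc.reason)
```

With the strict handler, a byte string that is not valid in the file system 
encoding is rejected instead of being passed through to the wide character API.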

system(), readlink() and unsetenv() only accept Unicode strings.

--

Possible bugs.

unlink() uses DeleteFileA() for byte strings and Py_DeleteFileW() for Unicode strings. 
Py_DeleteFileW() has a special case for symbolic links:

/* override the default DeleteFileW behavior so that directory
symlinks can be removed with this function, the same as with
Unix symlinks */

unsetenv() encodes the variable name to UTF-8, which looks wrong to me.

startfile() encodes the second argument (operation) to UTF-8 and then decodes 
it from ASCII to get a wchar_t* string. Why not simply use the "u" format to 
support more than ASCII characters?
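
The effect of that round trip can be reproduced at the Python level. This is 
only a sketch of the encode/decode pair described above, not the C code; 
roundtrip_operation() is a hypothetical name.

```python
def roundtrip_operation(operation):
    """Encode to UTF-8, then decode the bytes as ASCII.

    Hypothetical illustration: pure ASCII text survives unchanged, but any
    non-ASCII character produces bytes >= 0x80 that ASCII cannot decode.
    """
    return operation.encode("utf-8").decode("ascii")


if __name__ == "__main__":
    print(roundtrip_operation("open"))
    try:
        roundtrip_operation("\xe9dition")
    except UnicodeDecodeError as exc:
        print("non-ASCII operation rejected:", exc.reason)
```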

It's surprising that unsetenv() only accepts Unicode strings, because this 
Python function uses a C function with a bytes API (unsetenv).

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12084>
_______________________________________