Marc-Andre Lemburg <m...@egenix.com> added the comment:

STINNER Victor wrote:
> 
> New submission from STINNER Victor <victor.stin...@haypocalc.com>:
> 
> Python 3 uses unicode filenames on Windows and bytes filenames (while also 
> supporting unicode filenames) on other OSes. We have to support both types. On 
> POSIX systems, bytes filenames can be stored in unicode filenames using 
> sys.getfilesystemencoding() and the surrogateescape error handler (to store 
> undecodable bytes as unicode surrogates, see PEP 383).
> 
> I would like to create fs_encode() and fs_decode() in os.path to ease the 
> manipulation of filenames in the two types (str and bytes).
>  * Use fs_decode() to convert a filename from the OS native format to unicode
>  * Use fs_encode() to convert a unicode filename to the OS native format
> 
> On Windows, fs_decode() and fs_encode() don't touch the filename, but reject 
> filenames of any type other than str (unicode) with a TypeError, in particular 
> bytes filenames.
> 
> Mac OS X rejects invalid UTF-8 filenames, and so surrogateescape should maybe 
> not be used on this OS.
> 
> The attached patch implements this proposal.

Please follow the naming convention used in os.path. The functions
would have to be called os.path.fsencode() and os.path.fsdecode().

Other than that, I'm +0 on the patch: the sys.getfilesystemencoding() logic
doesn't really work well in practice. On Unix and BSD platforms, there's
no such thing as a single system-wide file system, and consequently
the file system encoding depends on the path you are looking at. For most
of those file systems, the name is just a sequence of bytes with arbitrary
encoding.
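
To make the point concrete, here is a minimal sketch of the POSIX-side
behaviour under discussion. The helper names fsdecode_sketch() and
fsencode_sketch() and their optional encoding parameter are hypothetical
and are not taken from the attached patch; the sketch only assumes the
surrogateescape error handler from PEP 383. It shows that the same bytes
name decodes differently depending on the assumed encoding, while
surrogateescape still lets undecodable bytes round-trip unchanged.

```python
import sys


def fsdecode_sketch(name, encoding=None):
    """Hypothetical helper: convert a bytes filename to str (POSIX behaviour)."""
    if isinstance(name, str):
        return name
    if not isinstance(name, bytes):
        raise TypeError("expected str or bytes, not %s" % type(name).__name__)
    # Assumption: fall back to the interpreter-wide file system encoding.
    encoding = encoding or sys.getfilesystemencoding()
    # Undecodable bytes become lone surrogates U+DC80..U+DCFF (PEP 383).
    return name.decode(encoding, "surrogateescape")


def fsencode_sketch(name, encoding=None):
    """Hypothetical helper: convert a str filename back to bytes."""
    if isinstance(name, bytes):
        return name
    if not isinstance(name, str):
        raise TypeError("expected str or bytes, not %s" % type(name).__name__)
    encoding = encoding or sys.getfilesystemencoding()
    # Lone surrogates produced by decoding are turned back into raw bytes.
    return name.encode(encoding, "surrogateescape")


# A Latin-1 encoded name, which is not valid UTF-8.
raw = b"caf\xe9.txt"

as_utf8 = fsdecode_sketch(raw, "utf-8")      # contains an escaped surrogate
as_latin1 = fsdecode_sketch(raw, "latin-1")  # decodes to a readable name

# Either way, the original bytes are recovered exactly.
roundtrip_utf8 = fsencode_sketch(as_utf8, "utf-8")
roundtrip_latin1 = fsencode_sketch(as_latin1, "latin-1")
```

The round trip is lossless whichever encoding is assumed, but the str form
of the name is only meaningful if the guess matches what the file system
under that particular path actually uses.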

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8514>
_______________________________________