Marc-Andre Lemburg <m...@egenix.com> added the comment:

STINNER Victor wrote:
>
> STINNER Victor <victor.stin...@haypocalc.com> added the comment:
>
>> I think that using ASCII is a safer choice in case of errors.
>
> I chose UTF-8 to keep backward compatibility:
> PyUnicode_DecodeFSDefaultAndSize() uses utf-8 if
> Py_FileSystemDefaultEncoding==NULL. If the OS has no nl_langinfo(CODESET)
> function at all, Python3 uses utf-8.

Ouch, that was a poor choice. In Python we have a tradition of avoiding
guessing where possible. Since we cannot guarantee that the file system
will indeed use UTF-8, it would have been safer to use ASCII. I'm not sure
why this reasoning wasn't applied to the file system encoding.

Nothing we can do about it now, though.

>> Using UTF-8 may be safe for reading file names, but it's not
>> safe for creating files or directories.
>
> Well, I don't know. Maybe you are right. And which encoding should be used if
> the nl_langinfo(CODESET) function is missing: ASCII or UTF-8?
>
> UTF-8 is also an optimistic choice: I bet that more and more OSes will move to
> UTF-8.

I think we should also add a new environment variable to override
the automatic determination of the file system encoding, much like
what we have for the I/O encoding:

    PYTHONFSENCODING: Encoding[:errors] used for file system.

(that would need to go on a new ticket, though)

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8610>
_______________________________________
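
For illustration only, here is a minimal Python sketch of how an
`Encoding[:errors]` value for the proposed PYTHONFSENCODING variable might
be parsed, with ASCII as the fallback when nl_langinfo(CODESET) is not
available. The helper names parse_fs_encoding() and choose_fs_encoding()
are hypothetical, as is the "strict" default error handler; nothing here
is taken from the interpreter's actual startup code.

```python
import codecs
import locale
import os


def parse_fs_encoding(value, default_errors="strict"):
    """Split an 'encoding[:errors]' string into (encoding, errors).

    Hypothetical helper: the format follows the proposal in this message,
    the validation steps are an assumption.
    """
    encoding, sep, errors = value.partition(":")
    if not encoding:
        raise ValueError("missing encoding name")
    codecs.lookup(encoding)  # raises LookupError for an unknown codec
    if sep and errors:
        codecs.lookup_error(errors)  # raises LookupError for an unknown handler
    else:
        errors = default_errors
    return encoding, errors


def choose_fs_encoding(environ=None, fallback="ascii"):
    """Pick a file system encoding without guessing UTF-8.

    Order (an assumption for this sketch): explicit override first,
    then the locale's codeset, then the conservative fallback.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("PYTHONFSENCODING")
    if override:
        return parse_fs_encoding(override)
    # nl_langinfo(CODESET) is not available on every platform.
    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        codeset = locale.nl_langinfo(locale.CODESET)
        if codeset:
            return codeset, "strict"
    return fallback, "strict"
```

With this sketch, setting PYTHONFSENCODING=utf-8:surrogateescape would
select UTF-8 with the surrogateescape handler, while a platform lacking
nl_langinfo(CODESET) and without the override would get ASCII rather than
an optimistic UTF-8 guess.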