Toshio Kuratomi added the comment:

Looking at the glib code, it looks like the SO post is closer to the truth. The API documentation for g_filename_to_utf8() is over-simplified to the point of confusion. This section of the glib API documentation is closer to what the code is doing:
https://developer.gnome.org/glib/stable/glib-Character-Set-Conversion.html#file-name-encodings

* When encoding matters, glib and gtk functions assume that the char*'s you pass to them point to strings encoded in utf-8.
* When a char* is not utf-8, you are responsible for converting it to utf-8 before passing it to the glib functions (if encoding matters).
* glib provides g_filename_to_utf8() for the special case of transforming filenames into the encoding that glib expects. (Presumably because glib and gtk deal with non-utf8 unicode filenames more often than with the equivalent environment variables, command line switches, etc.)
* Contrary to the API docs for g_filename_to_utf8(), the function will simply return a copy of the byte string it was passed unless G_FILENAME_ENCODING or G_BROKEN_FILENAMES is set. If either is set, then either the value of G_FILENAME_ENCODING or the encoding specified in the user's locale may be used to attempt to decode the filename.

@haypo, I'm pretty sure from reading the code for g_get_filename_charsets() that you have the conditionals reversed. What I'm seeing is:

    if G_FILENAME_ENCODING:
        charset = the first charset listed in G_FILENAME_ENCODING
        if charset == '@locale':
            charset = charset of user's locale
    elif G_BROKEN_FILENAMES:
        charset = charset of user's locale
    else:
        charset = 'UTF-8'

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19846>
_______________________________________
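
For anyone who wants to experiment with the lookup order described in the comment above, here is a rough Python model of it. This is only a sketch of my reading of g_get_filename_charsets(), not glib's actual code: the function name filename_charset() and its two parameters are made up for illustration, the locale charset is passed in rather than queried, and I am assuming that G_FILENAME_ENCODING is a comma-separated list and that G_BROKEN_FILENAMES counts as set whenever it is present in the environment.

```python
def filename_charset(environ, locale_charset):
    """Model the charset selection order described in the comment.

    `environ` is any mapping of environment variables (for example
    os.environ); `locale_charset` is the charset of the user's locale.
    Both names are hypothetical and exist only for this sketch.
    """
    encoding_list = environ.get("G_FILENAME_ENCODING")
    if encoding_list:
        # Assumption: the variable holds a comma-separated list and
        # only the first entry decides the filename charset.
        charset = encoding_list.split(",")[0]
        if charset == "@locale":
            charset = locale_charset
        return charset
    if "G_BROKEN_FILENAMES" in environ:
        # Assumption: any value, even an empty one, counts as "set".
        return locale_charset
    return "UTF-8"
```

With neither variable set, the model falls through to 'UTF-8', which matches the observation that g_filename_to_utf8() then returns the byte string it was given. Calling it as filename_charset(os.environ, locale.getpreferredencoding()) gives a rough idea of what to expect on a given machine, although the locale charset glib computes may be spelled differently from Python's.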