On Tue, 2003-01-14 at 02:23, [EMAIL PROTECTED] wrote:
> Not acceptable. Filenames are and must be in the locale charset.
> There is no other sane option [...]

Heh. I will quote from a previous message of mine about filenames in the locale charset, which you might not have seen, since you joined the discussion later:

On Fri, 2003-01-03 at 18:11, Jochen Voss wrote:
> As I see it, the current (broken ?) behaviour is, to use the user's
> locale setting (LC_CTYPE) to encode file names.

It appears so, and yes, this behavior is completely and fundamentally broken. If you have, say, a Chinese friend who logs onto your computer and sets LANG to something like zh_TW.Big5, then when he tries to 'ls' your files, it will completely fail. Likewise, when you try to look at his, it will not work at all.

Moreover, say the system administrator does something like 'find /home'. The resulting stream will be a mixture of ISO-8859-X and BIG5, and the two are impossible to reliably differentiate.

And of course the problem doesn't just occur when you have a multiuser system; your Chinese friend could send you a .ogg file named using BIG5, and your Latin-1 system would simply fail to decode the filename.

And finally, having the encoding of filenames depend on the current locale often doesn't make sense even for a single user. What if you are a software developer in an ISO-8859-1 locale, and you want to test the Japanese translation of your software? You run it with LANG=ja_JP.ISO-2022-JP or something to get the translations displayed. As a side effect, all the filenames on your system will fail to work.

In summary, UTF-8 is the *only* sane character set to use for filenames. Major upstream software for Debian like GNOME is moving towards requiring UTF-8 for filenames, and we should too.

> what do you expect "echo *" to do?

Quite frankly, I expect it to not work, unless they're using a UTF-8 terminal.

> You can't slap
> filters around everything; it's horribly buggy, and error-prone and would
> take forever to implement, IF everyone wanted to go along with it.

I am not sure.
I have a feeling we could make "core" programs like 'ls' and such do conversion, but I agree it would be quite a long time before we covered "most" of the programs people use.

> The
> only sane situation is to transition everything as a whole to UTF-8,
> with filterm or the like for legacy terminals. You can't just change
> filenames.

I think programs should start expecting UTF-8 filenames today, but be able to sanely handle filenames in the locale charset. That way we get the best of both worlds, and minimize the pain of the transition.

Note again that GNOME programs and the like are already creating UTF-8 filenames, because they work completely in UTF-8 internally. Now, they *could* try to convert them back to the locale charset. But I would argue strongly against this, because the conversion could fail if the locale's charset isn't able to encode some target characters. That may be an "unlikely" scenario, but when you're dealing with something as fundamental as filenames, you don't want to just ignore "unlikely" scenarios.
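
To make the "expect UTF-8, but handle the locale charset sanely" idea a bit more concrete, here is a rough Python sketch. It is only an illustration, not how 'ls' or GNOME actually do it: the helper names decode_filename and can_encode are hypothetical, and ISO-8859-1 stands in for whatever the legacy locale charset happens to be. The first helper tries UTF-8 and falls back to the legacy charset; the second shows why converting a UTF-8 name back to the locale charset can fail.

```python
def decode_filename(raw: bytes, legacy_charset: str = "iso-8859-1") -> str:
    """Hypothetical helper: interpret raw filename bytes.

    Try UTF-8 first; if the bytes are not valid UTF-8, fall back to the
    legacy (locale) charset, replacing anything that cannot be decoded.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(legacy_charset, errors="replace")


def can_encode(name: str, charset: str) -> bool:
    """Hypothetical helper: can this name be written in the given charset?"""
    try:
        name.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # A UTF-8 name is recognised as such.
    print(decode_filename("日本語.ogg".encode("utf-8")))
    # A Latin-1 name is not valid UTF-8, so the fallback path is taken.
    print(decode_filename("café.txt".encode("iso-8859-1")))
    # Converting a UTF-8 name back to a Latin-1 locale is not always possible.
    print(can_encode("日本語.ogg", "iso-8859-1"))
```

Note that the fallback is only a heuristic: a legacy-encoded name whose bytes happen to form valid UTF-8 would be misread, and a BIG5 name fed through the Latin-1 fallback comes out as garbage. That is the same ambiguity as in the 'find /home' case above, and it is why the fallback is useful only as a transition aid.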