On Mon, 2003-01-06 at 16:15, Jochen Voss wrote:
> Hello Colin,
>
> On Fri, Jan 03, 2003 at 09:50:26PM -0500, Colin Walters wrote:
> > In summary, UTF-8 is the *only* sane character set to use for
> > filenames.
> At least I agree to this :-)

Cool.

> I think that we need filename conversion between UTF-8 and the user's
> character set, because we cannot ban all non-UTF8 terminal types.  In
> my opinion the main problem is, where this conversion should take
> place.

I will say this much: I did not even consider doing this kind of
character set conversion as part of glibc or Linux.  It just seems like
a horrible kludge that would not actually work in practice.
Fundamentally, glibc and Linux cannot know what charset the application
itself works in.  For example, you might have data that undergoes UTF-8
conversion *twice*: once by the application and once by glibc.  It just
seems like a recipe for disaster.

> Because a lot of programs is affected, it would gain us much, if we
> could move this as deep as into libc or even into the kernel.

Again: I argue that we need to change all these programs *anyways*,
because you can't just use your same old C library string functions on
UTF-8.  I know it seems tempting to just stick some code into glibc, but
I have serious doubts that it will ever work in anything resembling a
reliable fashion.  Feel free to prove me wrong, of course!

> Does anybody know: how do they solve the problems we discuss here?
> Where do they convert filenames, e.g. when I login via ssh and
> type "ls -l Bär*" from my LC_CTYPE=ISO-8859-15 system?

I think that it quite simply does not work.

> > Again, major chunks of upstream software which have Unicode support
> > (like GNOME), are *already* defaulting to interpreting filenames as
> > UTF-8 by default.
> And how is the conversion done there?

What conversion?  GNOME apps speak UTF-8 natively, and that's about all
they speak unless you set the G_BROKEN_FILENAMES environment variable.
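
To make the double-conversion worry concrete, here is a rough sketch in
Python (not glibc code, just an illustration).  It assumes an
application that already writes UTF-8 and a hypothetical lower layer
that wrongly believes it is being handed ISO-8859-15 and converts again.
The filename and the variable names are made up for the example.  It
also shows why byte-oriented lengths, which is what strlen() reports,
stop matching character counts.

```python
# Hypothetical filename, "Bär", written with an escape so the example
# does not depend on the encoding of this source file.
name = "B\u00e4r"

# First conversion: the application encodes the name as UTF-8.
utf8_once = name.encode("utf-8")

# Second conversion (the assumed bug): a lower layer treats those bytes
# as ISO-8859-15 text and converts them to UTF-8 a second time.
utf8_twice = utf8_once.decode("iso-8859-15").encode("utf-8")

# Byte length (what a byte-oriented strlen() would see) versus the
# number of characters the user actually typed.
byte_len = len(utf8_once)
char_len = len(name)

print(utf8_once)
print(utf8_twice)
print(byte_len, char_len)
```

The doubly converted name no longer matches the bytes on disk, so a
lookup for the original file would fail, and neither layer can tell from
the bytes alone that the other one already did the conversion.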