On Sun, 2003-01-05 at 15:13, Denis Barbier wrote:
> Consider a program written in C, which creates new files with open(2);
> if I understand your proposal right, when a filename is not UTF-8
> encoded, it should be converted into UTF-8 according to user's locale.

Well, broadly speaking, there are two cases:

1) Programs which do not look at the contents of filenames, and just
   treat them as mostly opaque arguments. Commands like 'touch' fall
   into this category. We should not need to change them at all; you
   just start passing UTF-8 instead of ASCII or ISO-8859-1 to them.
   Any change to glibc to convert filenames would break these programs.

2) Programs which do manipulate filenames. These are trickier.

Now, there are several ways to make the programs in the second category
handle UTF-8. For some of them, no change will be required; stuff like
searching for ASCII characters still works with UTF-8, because every
byte of a multibyte UTF-8 sequence has its high bit set and so can
never be mistaken for an ASCII character. However, if these programs
display filenames to the user on a tty, they will need to convert them
to the user's locale encoding. (Of course, once we make UTF-8 terminals
standard, programs will not need to do this.) If they stuff them in a
GUI widget, they will have to be sure to tell the widget that the
strings are in UTF-8 (if necessary).

> I am wondering how to perform this task:
> a. Let open() perform this conversion.

No. That would virtually guarantee filename corruption.

> b. Add a utility function in a common library and patch all programs
> to add calls to this routine.

It depends. For some programs, instead of converting the filename back
to the user's locale encoding for internal manipulation (which may
fail, remember, since UTF-8 can encode far more than, say, ISO-8859-1),
it would be better to change the program to handle all strings
internally as UTF-8. For some programs this will be fairly trivial;
for others it may be difficult.

Another alternative is to have a small library which will first try
converting a filename from UTF-8 back into the user's locale encoding,
and only if that fails take the filename as-is.

The best approach will depend on the program, and how it manipulates
filenames.

> How do you think your proposal should be implemented?

I hope that helps.
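
For what it's worth, the "small library" alternative could look roughly
like the sketch below. This is only an illustration under some
assumptions, not an existing API: the function name
filename_for_display() and its return convention are made up for this
example, and it assumes the platform provides POSIX iconv(3). It tries
to convert a UTF-8 filename to a target charset, and if the name is not
valid UTF-8 or cannot be represented in that charset, it copies the
bytes through untouched.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any existing library).
 * Converts the UTF-8 filename `name` to `to_charset` for display.
 * Returns  1 if the name was converted,
 *          0 if conversion failed and the name was copied as-is,
 *         -1 if `out` is too small to hold even the unconverted name.
 */
int filename_for_display(const char *name, const char *to_charset,
                         char *out, size_t outlen)
{
    size_t namelen = strlen(name);

    if (outlen == 0)
        return -1;

    iconv_t cd = iconv_open(to_charset, "UTF-8");
    if (cd != (iconv_t)-1) {
        char *in = (char *)name;       /* iconv() does not modify the input */
        size_t inleft = namelen;
        char *dst = out;
        size_t outleft = outlen - 1;   /* keep room for the terminator */

        size_t rc = iconv(cd, &in, &inleft, &dst, &outleft);
        if (rc == 0)                   /* flush state for stateful charsets */
            rc = iconv(cd, NULL, NULL, &dst, &outleft);
        iconv_close(cd);

        /*
         * rc == (size_t)-1 is an error; rc > 0 means some characters
         * were substituted irreversibly. Only a clean conversion counts.
         */
        if (rc == 0) {
            *dst = '\0';
            return 1;
        }
    }

    /* Fallback: take the filename as-is. */
    if (namelen + 1 > outlen)
        return -1;
    memcpy(out, name, namelen + 1);
    return 0;
}
```

A caller would normally pass the user's locale charset as to_charset,
which can be obtained with nl_langinfo(CODESET) from <langinfo.h> after
calling setlocale(LC_CTYPE, ""). Note that a name which falls back to
the as-is path may still not display correctly; the point is only that
the program keeps working with the original bytes rather than failing.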