On Mon, Feb 11, 2002, Ilya Konstantinov wrote about "Re: Linux filenames with definite encoding (Was: FTP server with intl support)":
> I wasn't suggesting readdir should have another argument to specify the
> desired encoding, but rather that a standard encoding should be chosen.
> e.g. the ext2 standard should be revised to allow specifying that a
> given filename is in Unicode encoding.
>
> Eventually, glibc should offer u_readdir() and readdir().
> u_readdir() would return the filename in UTF-8 encoding (by asking the
> kernel for the Unicode filenames via the new syscall)
> readdir() would also call the kernel's new syscall and then convert the
> filenames to the locale's encoding.

Next thing, you'll ask for u_read() and u_write() for reading and writing
Unicode text in files...

No, UNIX traditionally operates on strings of "chars" (bytes/octets). System
calls never give special treatment to any byte except null (and "/" in
pathnames): not NL/CR, not ASCII vs. non-ASCII, or anything of that sort.

The only thing "required" of a filename encoding is that it leave nulls and
slashes alone (i.e., the encoding of any other character must not contain a
slash or null byte), and both the UTF-8 and ISO-8859-* encodings indeed have
that property.

Having filenames in (say) UTF-8 should be a convention, just like putting
binaries in (say) "/usr/bin" is a convention: it isn't a requirement of the
kernel, and not even a requirement of the filesystem.

-- 
Nadav Har'El                        |   Monday, Feb 11 2002, 29 Shevat 5762
[EMAIL PROTECTED]                   |-----------------------------------------
Phone: +972-53-245868, ICQ 13349191 |Someone offered you a cute little quote
http://nadav.harel.org.il           |for your signature? JUST SAY NO!
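
To illustrate the point about nulls and slashes, here is a minimal sketch,
assuming Python 3 and its standard codecs. The helper names
(reserved_bytes_in, is_path_safe) are made up for this example and are not
part of any library. It checks, character by character, whether an encoding
ever produces the bytes 0x00 or 0x2F for a character other than null or "/"
itself.

```python
# Sketch: the only bytes the kernel treats specially in pathnames are
# 0x00 (null) and 0x2F ("/"). An encoding is usable for filenames if no
# OTHER character encodes to a byte sequence containing one of them.
RESERVED = {0x00, 0x2F}


def reserved_bytes_in(ch, encoding):
    """Return the reserved bytes found in ch's encoding (empty set if none)."""
    return RESERVED & set(ch.encode(encoding))


def is_path_safe(encoding, codepoints):
    """True if no character other than null or "/" encodes to a reserved byte."""
    for cp in codepoints:
        ch = chr(cp)
        if ch in "\0/":
            continue  # the reserved characters themselves are expected
        try:
            if reserved_bytes_in(ch, encoding):
                return False
        except UnicodeEncodeError:
            continue  # character not representable in this encoding
    return True


ALL = range(0x110000)  # every Unicode code point
print(is_path_safe("utf-8", ALL))      # True
print(is_path_safe("iso8859-8", ALL))  # True
print(is_path_safe("utf-16-le", ALL))  # False: "A" encodes as 0x41 0x00
```

UTF-16 is included only as a contrast: it embeds null bytes in ordinary
characters, which is why it could not be passed through byte-oriented system
calls unchanged, whereas UTF-8 and ISO-8859-* can.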