> > In general, they don't.  Command-line utilities just use the sequence
> > of bytes entered by the user.
> 
> Obviously that depends on the application. A command-line utility that
> interprets a normal XML file containing filenames knows the characters
> but not the bytes. The same goes for command-line utilities that
> receive the filenames as text (e.g., some file transfer utility or daemon).

It's true that they know the characters, and not necessarily the bytes -- but
all of the tools I'm aware of ignore the characters and simply treat these
as bytes when it comes to making calls into the file system.
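
As a rough illustration of the distinction (a sketch only; the helper names
encode_name and create_raw are made up for this example, and it assumes a
POSIX-style file system that accepts arbitrary byte sequences as names):

```python
import os
import tempfile


def encode_name(name: str, encoding: str) -> bytes:
    """Turn a character-level filename into the bytes a file system call receives."""
    return name.encode(encoding)


def create_raw(directory: bytes, raw_name: bytes) -> bytes:
    """Create an empty file whose name is exactly raw_name (hypothetical helper).

    Because both arguments are bytes, Python passes them through to the
    operating system unchanged; no character encoding is applied here.
    """
    path = os.path.join(directory, raw_name)
    with open(path, "wb"):
        pass
    return path


# The same characters ("Ä.txt"), two different byte sequences:
utf8_name = encode_name("\u00c4.txt", "utf-8")      # b'\xc3\x84.txt'
latin1_name = encode_name("\u00c4.txt", "latin-1")  # b'\xc4.txt'
```

A tool that has only the characters has to pick one of these encodings
before it can make the call; once it has, the file system just sees the bytes.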

> If I run xev on my linux box (I don't have X on any (Open)Solaris) and
> press the Ä-key on my keyboard it says "keycode 48" and "keysym 0xe4",
> and then "XLookupString gives 2 bytes: (c3 a4) "ä"". Thus at least
> XLookupString seems to know that I'm using UTF-8. Where did it (or
> whoever converted 0xe4 to 0xc3a4) get the needed info?

Depending on what version of xev you've got, there's a good chance it made a
call to XmbLookupString (the "multibyte" version of XLookupString). This uses
the current locale for the encoding; the locale is taken from environment
variables (LC_ALL, LC_CTYPE, LANG), which the application picks up when it
calls setlocale(). (But this has wandered far afield of file systems -- though
it's true that the file system could potentially look at environment variables
to make encoding choices!)
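
To make the lookup concrete, here is a hand-rolled sketch (not Xlib code; the
functions codeset_from_env and keysym_to_bytes are invented for illustration).
It assumes the usual POSIX precedence of LC_ALL over LC_CTYPE over LANG, that
a locale name without a codeset can be treated as ASCII (a simplification),
and that keysyms 0x20..0xff coincide with Latin-1 code points:

```python
def codeset_from_env(env):
    """Return the codeset named by the locale environment variables.

    Checks LC_ALL, then LC_CTYPE, then LANG; an empty value counts as unset.
    Simplification: a locale name with no ".codeset" part is treated as ASCII.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            # Locale names look like language_TERRITORY.codeset@modifier
            if "." in value:
                return value.split(".", 1)[1].split("@", 1)[0]
            return "ascii"
    return "ascii"


def keysym_to_bytes(keysym, env):
    """Encode a Latin-1-range keysym in the codeset chosen by the environment."""
    # Assumption: in the range 0x20..0xff the keysym value is the code point.
    return chr(keysym).encode(codeset_from_env(env))
```

With LANG=en_US.UTF-8, keysym 0xe4 comes out as the two bytes c3 a4, which
matches the xev output quoted above; with an ISO-8859-1 locale it would be
the single byte e4.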
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
