Roger Leigh wrote:
> On Sat, Sep 01, 2012 at 07:32:48PM -0400, Dan B. wrote:
...
>> Which common programs (e.g., getty, xterm/etc., sed/grep?) do something
>> different based on the charset portion of the locale setting?
>
> All of them, in short.
>
> When you run a terminal emulator such as xterm, it will get the
> encoding to use inside the emulator using nl_langinfo(3). ...
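
For reference, a minimal sketch of how to see the codeset a program would get back from nl_langinfo(CODESET) under a given locale. It assumes the locale(1) utility is installed; codeset_for is a hypothetical helper name, and en_US.UTF-8 is only an example that may not be generated on every system.

```sh
# Print the codeset reported for the given locale name.
# codeset_for is a hypothetical helper, not a standard command.
codeset_for() {
    LC_ALL="$1" locale charmap 2>/dev/null
}

codeset_for C             # codeset of the plain "C" locale
codeset_for en_US.UTF-8   # falls back to the "C" codeset if the locale is not generated
```
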
What about the virtual consoles?
Whether I choose a UTF-8 locale or None as the default system locale
(in the dialog for "dpkg-reconfigure locales"), after logging out and
back in (to make sure the shell gets fresh settings), the command
    echo $'\xC2\xA2'
displays the same thing (the cent sign).
Is the virtual console supposed to follow the locale's character
encoding? If so, does something else (e.g., something in /etc/init.d/)
need to be run to make a difference?
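
A small sketch of what the test above actually sends to the console: the command only writes two raw bytes, whatever the locale is, so what gets displayed depends on how the terminal decodes them. cent_bytes is a hypothetical helper name, and octal escapes are used here only because they work in any POSIX printf.

```sh
# Emit the two bytes of the UTF-8 encoding of the cent sign (U+00A2).
# cent_bytes is a hypothetical helper, not a standard command.
cent_bytes() {
    printf '\302\242'
}

# Dump the bytes in hex; the locale setting does not change them.
cent_bytes | od -An -tx1
LC_ALL=C cent_bytes | od -An -tx1
```
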
No, I'm not actually trying to turn off using UTF-8. I'm just trying
to find out how things work (what actually is affected by the locale
settings).
Actually, what I really want to know is how to revert the sorting of
file names from ls (and Emacs dired listings) from the order caused
by having "en_US" in LANG=en_US.UTF-8 back to the traditional (old)
Unix order (e.g., what LANG=C would yield) without messing up all the
UTF-8 support that's all over Linux now.
First of all, can UTF-8 be combined with the "C" locale as in
LANG=C.UTF-8?
Or do I probably want something closer to LANG=en_US.UTF-8 LC_COLLATE=C
(in order to reduce the number of locale settings I'm overriding)?
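
A minimal sketch of the second option, using sort(1) as a stand-in for ls. traditional_sort and names are hypothetical helpers; LC_ALL is cleared inside the function because, if set, it would override LC_COLLATE.

```sh
# Sort stdin with UTF-8 for everything except collation, which uses "C".
# traditional_sort is a hypothetical helper, not a standard command.
traditional_sort() {
    LC_ALL= LANG=en_US.UTF-8 LC_COLLATE=C sort
}

# Sample file names, one per line (hypothetical data).
names() {
    printf '%s\n' b A a B
}

names | traditional_sort   # prints A, B, a, b (uppercase before lowercase)
```

Whether a C.UTF-8 locale is available can be checked by looking for it in the output of "locale -a".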
> When you run sed/grep, the encoding will affect how it processes the
> text.
Are you sure about sed?
I tried probing how LANG= vs. LANG=en_US.UTF-8 affected whether
the regular expression "[a-z]" matched "X". Grep seems to be
affected as expected, but sed never matched. (That's on Squeeze.)
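
The probe can be sketched roughly as follows. probe is a hypothetical helper that feeds "X" to a filter command under a given locale and reports whether anything came out. Only the "C" results are predictable; the en_US.UTF-8 results depend on the tool version and on whether that locale is generated, so no output is claimed for them.

```sh
# probe LOCALE COMMAND...: run COMMAND on the line "X" under LOCALE
# and report whether it printed anything (hypothetical helper).
probe() {
    loc=$1
    shift
    if [ -n "$(printf 'X\n' | LC_ALL="$loc" "$@")" ]; then
        echo "match"
    else
        echo "no match"
    fi
}

probe C grep '[a-z]'            # "no match" in the C locale
probe C sed -n '/[a-z]/p'       # "no match" in the C locale
probe en_US.UTF-8 grep '[a-z]'
probe en_US.UTF-8 sed -n '/[a-z]/p'
```
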
Daniel