> Gah. I thought (and I use the word loosely here) that locales generally
> specified how a particular character should be interpreted when there's
> some ambiguity--the high bit ASCII characters spring to mind, given there's
> a dozen or more different interpretations with them. I was under the
> impression that given an encoding and a locale, there was no ambiguity and
> that the interpretation of a particular character was exact. In the Big5
> case, I'd assume that there'd be at least two different
> locales--Traditional Chinese and Simplified Chinese--that governed how the
> characters are interpreted.
>
> I get the feeling I'm being rather naive here, huh?
Well, single-minded with a purpose, maybe :-)
I was being too broad; sorry if I threw you into the pits of despair.
*If* we are talking about a pile of octets and an encoding, yes,
then I think we have an unambiguous thing.
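
To make the "octets plus encoding" point concrete, here is a minimal
Python sketch. The two byte values are picked purely for illustration
(they are not from any real data): the same pile of octets reads
differently under each encoding, but once the encoding is named the
result is fixed.

```python
# The same two octets, decoded under three different encodings.
# The byte values are an illustrative assumption, nothing more.
octets = bytes([0xA4, 0x40])

as_big5 = octets.decode("big5")          # one double-byte character
as_latin1 = octets.decode("iso8859-1")   # two characters: currency sign, '@'
as_latin9 = octets.decode("iso8859-15")  # two characters: euro sign, '@'

for name, text in (("big5", as_big5),
                   ("iso8859-1", as_latin1),
                   ("iso8859-15", as_latin9)):
    # Show the code points so the output does not depend on the terminal.
    print(name, [hex(ord(ch)) for ch in text])
```

Note that no locale is consulted anywhere in this: the encoding name
alone decides the interpretation.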
But a locale is a collection of user preferences: how I want
my dates to be formatted, how I want my strings to be sorted.
Encodings and locales are somewhat orthogonal. A locale may
be "clarified" by appending an encoding to the locale name,
e.g. fr_CA.ISO8859-1, ja_JP.SJIS. But what that actually
*means* I have no idea, and I have not seen a standard that would
explain it. That you want your messages (locale category
LC_MESSAGES) in that encoding? That you want your dates
in that encoding?
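
As a rough illustration of the naming convention only, the sketch
below splits such a name into its parts. The helper name
split_locale_name is hypothetical, and the language_TERRITORY.codeset
layout is assumed from the two examples above. It says nothing about
which locale categories the codeset part is supposed to govern, which
is exactly the open question.

```python
def split_locale_name(name):
    """Split 'language_TERRITORY.codeset' into its three parts.

    Hypothetical helper for illustration; a missing part comes back
    as an empty string.
    """
    base, _, codeset = name.partition(".")
    language, _, territory = base.partition("_")
    return language, territory, codeset


for name in ("fr_CA.ISO8859-1", "ja_JP.SJIS"):
    print(name, "->", split_locale_name(name))
```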
--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen