> 1) ISO-8859-1 is used to represent text in several different languages, including German and Swedish. German and Swedish differ in their sort order, even for things they have in common. (For example, ö (o-with-diaeresis) is considered a separate letter in Swedish, but is just an accented "o" in German.) So (assuming my strings aren't explicitly language-tagged, or are tagged with "Dunno"), what sort order does ISO-8859-1 define? I'm not sure whether the national standards themselves actually define a sort order, so are we going to
National standards yes, ISO 8859 (and the like) no. In other words, sorting standards exist, but they have (quite rightly) nothing to do with character set standards.
Real life sorting is messy (multiple passes, some parts may be ignored in some passes, acronyms, etc.) and worlds apart from "let's compare the bytes one by one" or even from "let's compare code points" or even from "let's compare grapheme (clusters)".
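
To make the gap between code point order and language order concrete, here is a minimal Python sketch (not from the thread). The helper names `german_key` and `swedish_key` and the word list are assumptions made up for illustration, and the rules are deliberately simplified: real collation involves multiple passes and many more cases than this covers.

```python
# Hypothetical, hand-rolled sort keys for illustration only. Real collation
# is multi-level and far more involved than either function below.

def german_key(word):
    # Simplified German dictionary order: treat "ö" as "o" on the first
    # pass, and use the original spelling only to break ties.
    primary = word.lower().replace("ö", "o")
    return (primary, word)

SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"

def swedish_key(word):
    # Simplified Swedish order: å, ä, ö are separate letters after z.
    # Raises ValueError for characters outside SWEDISH_ALPHABET.
    return [SWEDISH_ALPHABET.index(ch) for ch in word.lower()]

words = ["öut", "oz", "out", "pa", "Zoo"]

by_code_point = sorted(words)                # plain code point comparison
by_german = sorted(words, key=german_key)
by_swedish = sorted(words, key=swedish_key)

print(by_code_point)  # ['Zoo', 'out', 'oz', 'pa', 'öut']
print(by_german)      # ['out', 'öut', 'oz', 'pa', 'Zoo']
print(by_swedish)     # ['out', 'oz', 'pa', 'Zoo', 'öut']
```

The same five strings come out in three different orders, and nothing in the character set itself says which of the three is wanted.
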
True enough, though what I want the language for is as much case-mangling as sorting.
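
As a rough illustration of why case-mangling can also depend on language, here is a small Python sketch (not from the thread). The function `upper_for` and its `lang` argument are hypothetical, and the only language-specific rule sketched is the well-known Turkish dotted/dotless i; everything else falls through to Python's language-neutral `str.upper()`.

```python
# Hypothetical helper: uppercase a string given a language tag.
# Only the Turkish/Azerbaijani i rule is handled; this is an assumption
# for illustration, not a complete case-mapping implementation.

def upper_for(text, lang):
    if lang in ("tr", "az"):
        # In Turkish, "i" uppercases to "İ" (U+0130) and "ı" (U+0131) to "I".
        text = text.replace("i", "\u0130").replace("\u0131", "I")
    return text.upper()

print(upper_for("istanbul", "en"))  # ISTANBUL
print(upper_for("istanbul", "tr"))  # İSTANBUL
```

With an untagged (or "Dunno"-tagged) string, the code has no way to choose between the two results.
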
> define one for every "character set"? In addition, many languages can be represented in several different "character sets", so that seems to mean that the sort order for "öut" v. "out" will vary, depending on the "character set" used for those strings?
FWIW, I think binding language to strings is a Mistake. But I have decided to give up trying to argue anymore about it since Dan seems to be convinced that it will solve some problems.
Well, it's always possible that, once we get deeper into this,
a) I get over the snit and b) I realize what a profoundly stupid idea it was in the first place.
Wouldn't be the first time, and probably not the last either.

-- Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk