Thanks for your response. I'm not sure that you and I are speaking about exactly the same things, since you state that the logical extensions, if not outright goals, of an alternate approach would be an exclusionary monoculture. I'm not sure that's quite right....

On Tue, 2004-04-13 at 15:06, Dan Sugalski wrote:

> >> *) Provides language-sensitive manipulation of characters (case mangling)
> >> *) Provides language-sensitive comparisons
> >
> >Those two things do not seem to me to need language-specific strings at
> >all. They certainly need to understand the language in which they are
> >operating (avoiding the use of the word locale here, as per Larry's
> >concerns), but why does the language of origin of the string matter?
>
> Because the way a string is upcased/downcased/titlecased depends on
> the language the string came from. The treatment of accents and a
> number of specific character sequences depends on the language the
> string came from.
> Ignore it and, well, you're going to find that
> you're messing up the display of someone's name. That strikes me as
> rather rude.

For proper names, you may have a point (though the ordering of names in
a phone book, for example, is often according to the language of the
book, not the origin of the names), and in some forms of string
processing, that kind of deference to the origin of a word may turn out
to be useful. I do "get" that much.

What I'm not getting is

* Why do we assume that the language property of a string will be the
  language from which the word correctly originates rather than the
  locale of the database / web site / file server / whatever that we
  received it from? That could actually result in dealing with native
  words according to the rules of foreign languages, and boy-howdy is
  that going to be fun to debug.

* Why is it so valuable as to attach a value to every string ever
  created for it rather than creating an abstraction at a higher level
  (e.g. a class)

* Why wouldn't you do the same thing for MIME type, as strings may also
  (and perhaps more often) contain data which is more appropriately
  tagged that way? The SpamAssassin guys would love you for this!

> What I don't want to do is *force* uniformity. Some of us do care.

Hey, that's a bit of a low blow. I care quite a bit, or I would not ask.
I'm not saying that the guy who wants to sort names according to their
source language is wrong, I'm saying that he doesn't need core support
in Parrot to do it, so I'm curious why it's in there.

> We've tried the whole monoculture thing before.

I just don't think that moving language up a layer or two of abstraction
enforces a monoculture... again, I'm willing to see the light if someone
can explain it.

A lot of your response is about "enforcing", and I'm not sure how I gave
the impression of this being an enforcement issue (or perhaps you think
that non-localization is something that needs to be enforced?) I just
can't see how every string needs to carry around this kind of
world-view-altering context when 99% of programs that use string data
(even those that use mixed encodings) won't want to apply said context,
but rather perform all operations according to their locale. Am I wrong
about that?

One thing that was not answered, though is what happens in terms of
dominance. When sorting French and Norwegian Unicode strings, who loses
(wins?) when you try to compare them? Comparing across language
boundaries would be a monumental task, and would be instantly reviled as
wrong by every language purist in the world (to my knowledge no one has
ever published a uniform way to compare two words, much less arbitrary
text, unless you are willing to do so using the rules of one and only
one culture (and I say culture because often the rules of a culture are
mutually incompatible with those of any one source language's strict
rules)).

So, if you have to convert in order to compare, whose language do you do
the comparison in? You can't really rely on LHS vs. RHS, since a sort
will reverse these many times (and C<$a cmp $b> had better be
C<-($b cmp $a)> or your sort may never terminate!)

-- 
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback
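
P.S. To make the LHS-vs-RHS worry concrete, here is a minimal sketch. It is Python rather than Perl or anything Parrot actually does, and every name in it is hypothetical. The two "rule sets" are toy stand-ins I made up for illustration (accented letters folding to their base letter versus "æøå" sorting after "z"), not real French or Norwegian collation. Each string carries a language tag, and the left operand's tag picks the rules:

```python
import unicodedata
from functools import cmp_to_key

EXTRA = "æøå"  # toy "Norwegian-like" rule: these sort after z, in this order


def fold_accents_key(text):
    """Toy "French-like" rule: accented letters sort with their base letter."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    base = [ch for ch in decomposed if not unicodedata.combining(ch)]
    return tuple(ord(ch) - ord("a") for ch in base)


def letters_after_z_key(text):
    """Toy "Norwegian-like" rule: æ, ø and å are separate letters after z."""
    key = []
    for ch in unicodedata.normalize("NFC", text.lower()):
        if ch in EXTRA:
            key.append(26 + EXTRA.index(ch))
        else:
            key.append(ord(ch) - ord("a"))
    return tuple(key)


# Hypothetical language tags mapped to the toy rule sets above.
RULES = {"fr": fold_accents_key, "no": letters_after_z_key}


def cmp_keys(key, a_text, b_text):
    """Three-way compare (-1, 0, 1) of two texts under one rule set."""
    ka, kb = key(a_text), key(b_text)
    return (ka > kb) - (ka < kb)


def lhs_cmp(a, b):
    """Compare (text, lang) pairs using the LEFT operand's language."""
    return cmp_keys(RULES[a[1]], a[0], b[0])


def fixed_cmp(lang):
    """Build a comparator that ignores the tags and uses one rule set."""
    return lambda a, b: cmp_keys(RULES[lang], a[0], b[0])


a = ("år", "no")
b = ("zoo", "fr")

# Under the left operand's rules each side claims to be the greater one,
# so cmp(a, b) == -cmp(b, a) does not hold.
antisymmetric = lhs_cmp(a, b) == -lhs_cmp(b, a)

# With a single rule set the comparison is consistent in both directions.
by_no = [t for t, _ in sorted([a, b], key=cmp_to_key(fixed_cmp("no")))]
by_fr = [t for t, _ in sorted([a, b], key=cmp_to_key(fixed_cmp("fr")))]
```

Here lhs_cmp(a, b) and lhs_cmp(b, a) both come out as 1, so the outcome of a sort would depend on which operand the algorithm happens to put on the left. Picking one rule set for the whole sort gives a consistent order, though which order you get still depends on whose rules you picked.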