> I think you're basically forcing this concept onto national standards
> which lack it. I don't think that most of the national standards
> actually define the semantics of the characters they encode
> (categorizations, case mapping, sort order), and although they assign
> byte sequences to re
On Apr 27, 2004, at 10:25 AM, Dan Sugalski wrote:
At 9:40 AM -0700 4/27/04, Jeff Clites wrote:
On Apr 23, 2004, at 2:43 PM, Dan Sugalski wrote:
CHARACTER SET - Contains meta-information about code points. This
includes both the meaning of individual code points
(65 i
Dan Sugalski wrote:
> At 7:57 PM +0300 4/27/04, Jarkko Hietaniemi wrote:
>
>>> 1) ISO-8859-1 is used to represent text in several different languages,
>>> including German and Swedish. German and Swedish differ in their sort
>>> order, even for things they have in common. (For example, ö
>>> (
At 7:57 PM +0300 4/27/04, Jarkko Hietaniemi wrote:
> 1) ISO-8859-1 is used to represent text in several different languages,
> including German and Swedish. German and Swedish differ in their sort
> order, even for things they have in common. (For example, ö
> (o-with-diaeresis) is considered a separ
At 9:40 AM -0700 4/27/04, Jeff Clites wrote:
On Apr 23, 2004, at 2:43 PM, Dan Sugalski wrote:
CHARACTER SET - Contains meta-information about code points. This
includes both the meaning of individual code points
(65 is capital A, 776 is a combining diaeresis) as
I can't answer for Dan regarding implementation issues, but from
a (computer) language point of view, consistency is better than
correctness on this issue, because there is no single definition of
"correct" until you specify what you mean by "correct". So at the
first three Unicode support levels
> 1) ISO-8859-1 is used to represent text in several different languages,
> including German and Swedish. German and Swedish differ in their sort
> order, even for things they have in common. (For example, ö
> (o-with-diaeresis) is considered a separate letter in Swedish, but is
> just an accent
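
To make the German/Swedish point concrete, here is a minimal Python sketch of two sort keys applied to the same strings. The names `swedish_key`, `german_key`, and `SWEDISH_ALPHABET` are made up for illustration, and the rules are deliberately simplified assumptions (lowercase letters only, no real locale data); actual collation is much more involved than this.

```python
# Simplified, hypothetical collation keys. Real collation (locale-tailored,
# multi-level comparison) is far more involved than this sketch.

# Assumption: the Swedish alphabet ends ... x y z å ä ö, so ö sorts after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def swedish_key(word):
    # Treat ö (and å, ä) as separate letters placed after z.
    # Only handles characters present in SWEDISH_ALPHABET.
    return [SWEDISH_ALPHABET.index(ch) for ch in word.lower()]


def german_key(word):
    # Treat ö as an accented o: compare on the unaccented form first,
    # and use the original spelling only as a tie-breaker.
    lowered = word.lower()
    folded = lowered.replace("ö", "o").replace("ä", "a").replace("ü", "u")
    return (folded, lowered)


words = ["öl", "ost", "zon"]
german_sorted = sorted(words, key=german_key)
swedish_sorted = sorted(words, key=swedish_key)

print(german_sorted)
print(swedish_sorted)
```

The same three strings, which would be the same bytes in ISO-8859-1, come out in a different order depending on which language's rules are applied. The encoding alone does not determine the sort order.
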
On Apr 23, 2004, at 2:43 PM, Dan Sugalski wrote:
CHARACTER SET - Contains meta-information about code points. This
includes both the meaning of individual code points
(65 is capital A, 776 is a combining diaeresis) as
well as a set of categorizations o
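
As a rough illustration of the kind of per-code-point meta-information described above, the following Python sketch looks up the two example code points in the Unicode Character Database via the standard `unicodedata` module. The helper name `describe` is hypothetical, and this only shows what Unicode records for these code points; it is not meant as the proposed Parrot design.

```python
import unicodedata


def describe(code_point):
    # Return (name, general category, canonical combining class)
    # for a code point, as recorded in the Unicode Character Database.
    ch = chr(code_point)
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
    )


for cp in (65, 776):
    print(cp, describe(cp))
```

Here 65 comes back as an uppercase letter and 776 as a non-spacing combining mark, which is the "meaning plus categorization" split the definition is describing.
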