At 02:34 PM 4/20/2001 -0500, Jarkko Hietaniemi wrote:
> > >One additional datapoint to overload your brain with is to consider
> > >the ambiguity of equality and comparison. Unicode normalization:
> > >is A + grave equal to Agrave? Is Agrave less than Aacute? Unicode
> > >collation combined with language/locale-specific rules.
> >
> > Comparisons on Unicode data will do it on the Unicode collation version of
> > the string data. Equality checking will be done either on normalized data
>
>We need to include in our design a spot for the customization hooks, though.
This'll be buried in the vtable code for the various data types. If you do
a comparison with one or both arguments Unicode, then we do the collation
thing. Otherwise the vtable code's free to do whatever it thinks is
appropriate. (We can certainly encapsulate this stuff--Simon and I have
both been considering some sort of loadable string type system.)
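To make the dispatch rule concrete, here's a rough sketch in Python rather
than real vtable code. Everything named here (PString, toy_collation_key,
native_compare, compare) is made up for illustration, and the "collation"
is a toy that just ignores combining marks at the primary level. It is not
the Unicode Collation Algorithm and applies no locale rules.

```python
import unicodedata


def toy_collation_key(s):
    """Hypothetical stand-in for a real collation key.

    Primary level: case-folded text with combining marks stripped.
    Tiebreaker: the fully decomposed (NFD) string.
    """
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base.casefold(), decomposed)


class PString:
    """Hypothetical string value carrying an 'is this Unicode?' flag."""

    def __init__(self, value, is_unicode):
        self.value = value
        self.is_unicode = is_unicode


def native_compare(a, b):
    """Whatever the type thinks is appropriate; here, raw code point order."""
    return (a.value > b.value) - (a.value < b.value)


def compare(a, b):
    """Use collation if either argument is Unicode, else the native compare."""
    if a.is_unicode or b.is_unicode:
        ka = toy_collation_key(a.value)
        kb = toy_collation_key(b.value)
        return (ka > kb) - (ka < kb)
    return native_compare(a, b)
```

With this, "\u00C0bc" sorts before "Abd" as soon as either side is flagged
Unicode, while the plain code point comparison puts it after.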
> > or whatever representation it's in, depending on Larry's call. (I'd prefer
> > normalization form C, but I'm not sure the regularity's worth the CPU cost.
> > Telling the programmer to beware might be sufficient)
>
>The NFC seems like the way to go.
I wasn't all that clear. For places where parrot^Wperl 6 normalizes, it'll
go for NFC. The question is whether we actually check for/force
normalization, or rely on the programmer to Do The Right Thing.
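The two options look roughly like this, again sketched in Python using its
standard unicodedata module. The function names (is_nfc, eq_forced,
eq_trusting) are made up for the example and say nothing about what the
real interface would be.

```python
import unicodedata


def is_nfc(s):
    """Check for normalization: True if s is already in NFC."""
    return unicodedata.normalize("NFC", s) == s


def eq_forced(a, b):
    """Force normalization: pay the CPU cost on every equality check."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def eq_trusting(a, b):
    """Trust the programmer: compare whatever representation we were given."""
    return a == b


decomposed = "A\u0300"   # A followed by COMBINING GRAVE ACCENT
precomposed = "\u00C0"   # LATIN CAPITAL LETTER A WITH GRAVE
```

Here eq_forced(decomposed, precomposed) is true and
eq_trusting(decomposed, precomposed) is false, which is the "A + grave
versus Agrave" case from the start of the thread.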
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk