At 01:05 PM 6/11/2001 -0700, Russ Allbery wrote:
>Dan Sugalski <[EMAIL PROTECTED]> writes:
>
> > Should perl's regexes and other character comparison bits have an option
> > to consider different characters for the same thing as identical beasts?
> > I'm thinking in particular of the Katakana/Hiragana bits of Japanese,
> > but other languages may have the same concepts.
>
>I think canonicalization gets you that if that's what you want.
I don't think canonicalization should do this (and I really hope it
doesn't). This isn't really a canonicalization matter--words written in
one script aren't (AFAIK) the same as words written in the other, and
which script you use matters. (Which sort of argues against being able to
do this at all, I suppose...)
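
To make the distinction concrete, here is a minimal sketch in Python, using its standard unicodedata module as a stand-in (nothing in this thread pins down a Perl interface). It checks that none of the four Unicode normalization forms maps hiragana onto katakana, so a kana-insensitive comparison would need a separate, explicit fold. The names fold_kana and same_ignoring_kana are hypothetical, and the fold only covers the basic hiragana range U+3041..U+3096.

```python
import unicodedata

HIRAGANA_FIRST = 0x3041  # HIRAGANA LETTER SMALL A
HIRAGANA_LAST = 0x3096   # HIRAGANA LETTER SMALL KE
KANA_OFFSET = 0x60       # distance from a hiragana letter to its katakana twin


def fold_kana(text):
    """Hypothetical helper: map basic hiragana to katakana, leave the rest alone."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET)
        if HIRAGANA_FIRST <= ord(ch) <= HIRAGANA_LAST
        else ch
        for ch in text
    )


def same_ignoring_kana(a, b):
    """Hypothetical comparison that treats hiragana and katakana as equal."""
    return fold_kana(a) == fold_kana(b)


hira = "\u3072\u3089\u304c\u306a"  # "hiragana" written in hiragana
kata = "\u30d2\u30e9\u30ac\u30ca"  # the same syllables written in katakana

# Normalization alone never makes the two spellings compare equal.
normalization_keeps_them_apart = all(
    unicodedata.normalize(form, hira) != unicodedata.normalize(form, kata)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
)
```

Whether such a fold should be exposed at all is exactly the open question here; the sketch only shows that it would be an opt-in comparison layered on top of normalization, not part of it.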
>I definitely think that Perl should be able to do all of NFD, NFC, NFKD, and
>NFKC canonicalization.
C & D at least. KC & KD are doable as well, though I'm not sure when you'd
want them. (But who am I to decide?)
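
For reference, a small sketch of how the four forms differ, again in Python with its standard unicodedata module as a stand-in rather than any Perl API. The helper name all_forms and the three sample characters are chosen only for illustration. One case where the K forms earn their keep is half-width katakana, which NFKC/NFKD fold to the ordinary full-width letters while NFC/NFD leave them alone.

```python
import unicodedata


def all_forms(text):
    """Hypothetical helper: return the text in each of the four normalization forms."""
    return {
        form: unicodedata.normalize(form, text)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }


# U+00E9 LATIN SMALL LETTER E WITH ACUTE: the D forms split it into
# "e" plus U+0301 COMBINING ACUTE ACCENT, the C forms keep it composed.
accented = all_forms("\u00e9")

# U+FB01 LATIN SMALL LIGATURE FI: only the K (compatibility) forms
# replace it with the two letters "fi".
ligature = all_forms("\ufb01")

# U+FF76 HALFWIDTH KATAKANA LETTER KA: only the K forms map it to
# U+30AB KATAKANA LETTER KA.
halfwidth = all_forms("\uff76")

for label, result in (("accented", accented), ("ligature", ligature), ("halfwidth", halfwidth)):
    for form, value in result.items():
        # Print code points so the differences are visible on any terminal.
        print(label, form, " ".join("U+%04X" % ord(ch) for ch in value))
```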
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk