Nadav Har'El wrote on Tue, Mar 13, 2012 at 22:16:23 +0200:
> On Tue, Mar 13, 2012, Elazar Leibovich wrote about "Re: Unicode in C":
> > Something very important that one needs to consider is Unicode normalization.
> > That is, how to strip out the Niqud, and to substitute, say, KAF WITH DAGESH
> > (U+FB3B) with just a KAF (U+05DB) etc.
>
> Is this really important? Does anybody actually use "Kaf with Dagesh"?
> Why does it even exist? :(
>
FWIW, Unicode normalization isn't just about ignoring niqud; it's also about
one character having two or more equivalent encodings, such as é, which can
be written either precomposed (U+00E9) or as e followed by a combining acute
accent (U+0065, U+0301). I'm not sure whether this particular issue applies
to Hebrew.

Daniel
(maybe you knew this already)

> I noticed there are even more bizarre characters, like "HEBREW LETTER
> ALEF WITH MAPIQ" (!?), "HEBREW LIGATURE ALEF LAMED", "HEBREW LETTER WIDE
> ALEF", "HEBREW LETTER ALEF WITH QAMATS" (Is Yiddish called Hebrew now??)
> "HEBREW LETTER ALTERNATIVE AYIN", and other junk. Why do these exist?
> This is sad.
>
> Nadav.
>
> --
> Nadav Har'El                        | Tuesday, Mar 13 2012,
> n...@math.technion.ac.il            |-----------------------------------------
> Phone +972-523-790466, ICQ 13349191 |War doesn't determine who's right but
> http://nadav.harel.org.il           |who's left.
>
> _______________________________________________
> Linux-il mailing list
> Linux-il@cs.huji.ac.il
> http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
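
P.S. To make the two points in this thread concrete (equivalent encodings of é, and reducing KAF WITH DAGESH to a plain KAF), here is a minimal sketch. It is in Python rather than C only because Python ships normalization in its standard `unicodedata` module; in C you would need an external library such as ICU. The helper name `strip_marks` is my own invention, and dropping every nonspacing mark (category Mn) is an assumption about what "strip out the niqud" should mean: it also removes accents and cantillation marks, which may or may not be what you want.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose to NFD, then drop all nonspacing marks (category Mn).

    Hypothetical helper: this removes niqud, but also accents and
    cantillation marks, since they share the same general category.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# Two equivalent encodings of the same character.
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
combining = "e\u0301"    # e followed by COMBINING ACUTE ACCENT

# The raw strings differ, but normalization maps one form onto the other.
assert precomposed != combining
assert unicodedata.normalize("NFC", combining) == precomposed
assert unicodedata.normalize("NFD", precomposed) == combining

# KAF WITH DAGESH (U+FB3B) decomposes to KAF (U+05DB) + DAGESH (U+05BC);
# dropping the mark leaves the plain KAF.
assert strip_marks("\ufb3b") == "\u05db"
```

So comparing strings byte for byte without normalizing first can report two identical-looking strings as different, which is the part of the problem that has nothing to do with niqud.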