Not sure about your exact case, but ICU's normalization does handle those characters.
http://unicode.org/cldr/utility/transform.jsp?a=nfc%3Bhex&b=%5Cu30B9%5Cu3099
(That tool uses ICU for NFC).

Mark <https://google.com/+MarkDavis>

*— Il meglio è l’inimico del bene —*

On Tue, Mar 11, 2014 at 4:50 PM, Markus Doppelbauer <[email protected]> wrote:

> Hello,
>
> I have an other problem making the normalization process binary
> compatible with ICU.
> Why does "30B9 3099" not combine to "30BA"?
>
> Steps to reproduce:
> wget http://doppelbauer.name/katakana.txt
> uconv -f utf8 -t utf8 -x nfd <katakana.txt >ndf.txt
> uconv -f utf8 -t utf8 -x nfc <ndf.txt >nfc.txt
> diff katakana.txt nfc.txt
>
> Expected result: "katakana.txt" == "nfc.txt"
>
> uconv v2.1 ICU 4.8.1.1
>
> Thanks a lot
> Markus
>
> _______________________________________________
> Unicode mailing list
> [email protected]
> http://unicode.org/mailman/listinfo/unicode
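
For an independent cross-check of the NFD/NFC round trip in the quoted steps, here is a minimal sketch. It uses Python's standard `unicodedata` module, not ICU, so it only shows what the Unicode data prescribes for this pair and says nothing about a particular `uconv` build. The helper name `hex_codepoints` is made up for illustration.

```python
import unicodedata


def hex_codepoints(text):
    """Return the code points of text as uppercase hex strings."""
    return [f"{ord(ch):04X}" for ch in text]


# U+30B9 KATAKANA LETTER SU followed by
# U+3099 COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK
decomposed = "\u30B9\u3099"

# NFC should compose the pair into U+30BA KATAKANA LETTER ZU.
composed = unicodedata.normalize("NFC", decomposed)

# NFD of the precomposed letter should give the two-code-point form back.
redecomposed = unicodedata.normalize("NFD", "\u30BA")

print("NFC:", hex_codepoints(composed))
print("NFD:", hex_codepoints(redecomposed))
```

If the same input still fails to round-trip through `uconv`, comparing the hex dump of `katakana.txt` against these code points may show whether the file contains the sequence you expect.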

