On Fri, 8 Jul 2011 15:00:42 -0500, Joshua and Amy <josh.ruth...@gmail.com> wrote:
> I'm creating some hyphenation rules for Jarai texts that I'm
> interlinearizing. Here's the problem: In various texts, a complex character
> such as LATIN SMALL LETTER A WITH BREVE might be encoded as a single code
> point (U+0103) or as a combination of code points (LATIN SMALL LETTER A:
> U+0061 plus COMBINING BREVE: U+0306).
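
To make the two encodings described above concrete, here is a minimal Python sketch. It is only an illustration and not something from the original texts or tools: it assumes Python 3 and its standard `unicodedata` module, and the names `precomposed`, `decomposed`, and `to_nfc` are hypothetical. It shows that the two forms compare as different strings until both are brought to the same normalization form (NFC here; NFD would work equally well as long as it is applied consistently).

```python
import unicodedata

# Two encodings of the same visible character (names are illustrative only).
precomposed = "\u0103"   # LATIN SMALL LETTER A WITH BREVE, one code point
decomposed = "a\u0306"   # LATIN SMALL LETTER A + COMBINING BREVE, two code points


def to_nfc(text):
    """Return text in Unicode Normalization Form C (canonical composition)."""
    return unicodedata.normalize("NFC", text)


# Without normalization the strings differ, so a naive search or a
# pattern written against one form will miss the other.
print(precomposed == decomposed)

# After normalizing both sides to the same form, they compare equal.
print(to_nfc(precomposed) == to_nfc(decomposed))
```

Whether normalization is best done once on the source files or on the fly depends on the toolchain, which this sketch does not address.
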
Can't (shouldn't!) you pass your texts through a Unicode normalization process? Otherwise search on them might not work either, depending on how smart your search tool is.

Mike Maxwell

--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
http://tug.org/mailman/listinfo/xetex