Under the RL2.2 link of tr18, there appears to be an error:

    C2. An implementation claiming conformance to Level 2 of this specification shall satisfy C1, and meet the requirements described in the following sections:
    RL2.1 Canonical Equivalents
    RL2.2 Extended Grapheme Clusters
    RL2.3 Default Word Boundaries
    RL2.4 Default Loose Matches
    RL2.5 Name Properties
    RL2.6 Wildcards in Property Values

Following the RL2.2 link, you find this:

    2.2 Extended Grapheme Clusters

    One or more Unicode characters may make up what the user thinks of as a character. To avoid ambiguity with the computer use of the term character, this is called a grapheme cluster. For example, "G" + acute-accent is a grapheme cluster: it is thought of as a single character by users, yet is actually represented by two Unicode characters. The Unicode Standard defines extended grapheme clusters that keep Hangul syllables together and do not break between base characters and combining marks. The precise definition is in UTR #29: Text Boundaries [UAX29]. These extended grapheme clusters are not the same as tailored grapheme clusters, which are covered in Level 3, Tailored Grapheme Clusters.

    RL3.12 Extended Grapheme Clusters

    To meet this requirement, an implementation shall provide a mechanism for matching against an arbitrary extended grapheme cluster, a literal cluster, and matching extended grapheme cluster boundaries.

Do you guys imagine that that should be "RL2.2" there instead of "RL3.12"? Why should RL2.2 -> 2.2 -> RL3.12? Or is that actually talking about tailored grapheme clusters? I can't tell.

--tom
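
For anyone who wants to see the "G" + acute-accent example from the quoted section concretely, here is a minimal Python sketch. The helper name approx_clusters is hypothetical, and it is only an approximation: it attaches combining marks to the preceding base character and does not implement the full extended grapheme cluster rules of UAX #29 (Hangul syllables, for instance, are not handled).

```python
import unicodedata


def approx_clusters(text):
    """Group each base character with the combining marks that follow it.

    Simplification (an assumption of this sketch): only base + combining-mark
    sequences are handled, not the complete UAX #29 boundary rules.
    """
    clusters = []
    for ch in text:
        # A nonzero canonical combining class means "ch" is a combining mark,
        # so it is appended to the cluster that is already open.
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


s = "G\u0301o"  # "G", U+0301 COMBINING ACUTE ACCENT, "o"
print(len(s))                    # 3 code points
print(len(approx_clusters(s)))   # 2 user-perceived characters
```

The point is only that the user sees two characters where the string holds three code points, which is the gap the 2.2 requirement is meant to cover.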