And arguably, I have also wanted this for a long time, instead of the hacks introduced by the so-called "double" and "half" diacritics, which break the character identity of those diacritics and also introduce encoding ambiguities.
In fact, those things would have been encoded long ago if Unicode and ISO 10646 had extended their character model to cover a broader range of "structured character clusters". Two format characters (with combining class 0 for the purposes of normalization) would have been enough for most applications:

- U+xxx0 BEGIN EXTENDED CLUSTER (BEC)
- U+xxx1 END EXTENDED CLUSTER (EEC)

You would then have encoded the standard diacritics after the sequence enclosed by these characters, for example for cartouches (using an enclosing diacritic).

A third format control would have been used as well, to specify that two clusters (simple letters or letters with simple diacritics, and including extended clusters) stack vertically instead of horizontally. With this third one, the basic structure would really be encodable as plain text.

Yes, this would not have worked with today's OpenType specifications, but that would have been the occasion to extend those specifications, not something to block the encoding process. I am still convinced that this should not be part of an "upper-layer standard", which is not interoperable and complicates the integration of those pseudo-encoded texts.

Once the structure is encoded as such, there is still the possibility of creating a linear graphical representation as a reasonably readable fallback that exhibits the structure unambiguously, even if the text renderer cannot produce the 2D layout. You just need to make those format controls visible by themselves with a glyph, or by some other means offered by the text renderer, including colors or various style effects.

2011/11/14 Shriramana Sharma <[email protected]>:
> That is not what he asked. He wants more than one base character to combine
> with a single combining mark.
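
To make the bracketing idea in the reply above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: BEC and EEC have no assigned code points (the message leaves them as U+xxx0 and U+xxx1), so the private-use characters U+E000 and U+E001 stand in for them, and the names split_clusters and linear_fallback are hypothetical helpers, not part of any standard or library. The sketch groups a bracketed sequence with the combining marks that follow it, and produces the linear fallback by making the two controls visible.

```python
import unicodedata

# Hypothetical stand-ins: the real code points are unassigned (U+xxx0, U+xxx1),
# so private-use characters are used purely for illustration.
BEC = "\uE000"  # BEGIN EXTENDED CLUSTER (assumed placeholder)
EEC = "\uE001"  # END EXTENDED CLUSTER (assumed placeholder)


def split_clusters(text):
    """Split text into clusters.

    A cluster is either one character or a balanced BEC...EEC group,
    followed by any combining marks (general category M*) that come after it.
    """
    clusters = []
    i, n = 0, len(text)
    while i < n:
        start = i
        if text[i] == BEC:
            depth = 0
            # Consume up to the matching EEC, allowing nested groups.
            while i < n:
                if text[i] == BEC:
                    depth += 1
                elif text[i] == EEC:
                    depth -= 1
                i += 1
                if depth == 0:
                    break
            if depth != 0:
                raise ValueError("unbalanced BEC/EEC")
        elif text[i] == EEC:
            raise ValueError("EEC without matching BEC")
        else:
            i += 1
        # Attach the diacritics encoded after the base or the bracketed group.
        while i < n and unicodedata.category(text[i]).startswith("M"):
            i += 1
        clusters.append(text[start:i])
    return clusters


def linear_fallback(text, open_glyph="\u27E6", close_glyph="\u27E7"):
    """Show the controls as visible brackets when 2D layout is unavailable."""
    return text.replace(BEC, open_glyph).replace(EEC, close_glyph)


# "ab" bracketed as one extended cluster, carrying a single circumflex.
sample = "x" + BEC + "ab" + EEC + "\u0302" + "y"
```

With this sample, split_clusters(sample) yields three clusters (the "x", the bracketed "ab" together with its circumflex, and the "y"), while linear_fallback(sample) keeps the same character order and merely replaces the two controls with visible brackets. The vertical-stacking control mentioned in the message is left out of the sketch.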

