> Date: Fri, 22 May 2015 22:52:24 +0200
> From: Zdenek Wagner <zdenek.wag...@gmail.com>
> The requirement of the Indic specification is to display the dotted
> circle if the mark cannot be combined.

Aha! Thank you for the pointer. I assume you're referring to this?

    http://www.microsoft.com/typography/otfntdev/indicot/other.htm

Based purely on the text, the situation is still a bit murky, though. Most seriously, the Indic specification is based on Unicode 3.1, and if everything in that section is meant to be normative, it's badly out of date with respect to more recent versions of Unicode. For one thing, it recommends attaching standalone combining marks to a space, but Unicode now recommends U+00A0 NO-BREAK SPACE for that purpose.

More to the point, the Indic specification says

    Uniscribe displays these marks using the fallback rendering mechanism defined in the Unicode Standard (section 5.12, 'Rendering Non-Spacing Marks' of the Unicode Standard 3.1), i.e. positioned on a dotted circle.

First, this only describes how Uniscribe handles the situation; it's not clear that this makes the behaviour a normative part of the Indic script specification. Second, that is no longer what Unicode recommends as the default fallback rendering in this situation:

    In a degenerate case, a nonspacing mark occurs as the first character in the text or is separated from its base character by a line separator, paragraph separator, or other format character that causes a positional separation. This result is called a defective combining character sequence (see Section 3.6, Combination). Defective combining character sequences should be rendered as if they had a no-break space as a base character. (See Section 7.9, Combining Marks.)

http://www.unicode.org/versions/Unicode7.0.0/UnicodeStandard-7.0.pdf, page 221.

(This wording goes back at least as far as Unicode 5.0, where it occurs at the bottom of page 173. Alas, I no longer have a copy of Unicode 3.0 at home, so I can't check the exact wording used in it.)
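
To make the Unicode fallback rule concrete, here is a minimal sketch of it as a text-preprocessing step in Python. It is only an illustration under stated assumptions, not what XeTeX, Uniscribe, or any font shaper actually does: the function name `add_nbsp_base` is hypothetical, "combining mark" is taken to mean general category Mn, Mc, or Me, and "separator" is approximated as categories Zl, Zp, Cc, and Cf (leaving ZWJ and ZWNJ alone, since they may legitimately sit inside a combining sequence).

```python
import unicodedata

NBSP = "\u00A0"                      # U+00A0 NO-BREAK SPACE
MARKS = {"Mn", "Mc", "Me"}           # combining mark categories
SEPARATORS = {"Zl", "Zp", "Cc", "Cf"}  # assumption: these end a sequence
JOINERS = {"\u200C", "\u200D"}       # ZWNJ, ZWJ: do not end a sequence


def add_nbsp_base(text):
    """Hypothetical helper: give every defective combining character
    sequence an explicit U+00A0 base, per the fallback quoted above."""
    out = []
    has_base = False  # True while the current sequence has a base
    for ch in text:
        cat = unicodedata.category(ch)
        if cat in MARKS:
            if not has_base:
                out.append(NBSP)  # supply the missing base
                has_base = True
        elif ch in JOINERS:
            pass  # joiners neither provide nor remove a base
        elif cat in SEPARATORS:
            has_base = False  # positional separation: next mark is isolated
        else:
            has_base = True  # any other character is treated as a base
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    isolated = "\u0951"  # DEVANAGARI STRESS SIGN UDATTA with no base
    print([hex(ord(c)) for c in add_nbsp_base(isolated)])
```

A mark that already follows a base letter is left untouched, so the function can be applied to a whole document rather than only to known isolated accents.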
On the other hand, as enjoyable as it is to play language lawyer with the Unicode specification, I'm happy to concede the point that I should just precede isolated characters by U+00A0 and everything will be ok.

I'm much more vexed by the malfunctioning Vedic accents. I live in hope that that can be fixed so I don't have to throw away my TECkit transliteration engine and start anew with LuaTeX.

Cheers,
David.