What about text that must exist
normalized for other purposes?
Domain names must be normalized to NFC,
for example. Will such strings display correctly if passed to USE?
A./
On 8/7/2019 1:39 PM, Andrew Glass via
Unicode wrote:
That's correct, the Microsoft implementation of USE spec does not normalize as part of the shaping process. Why? Because the ccc system for non-Latin scripts is not a good mechanism for handling complex requirements for these writing systems, and the effects of ccc-based normalization can disrupt authors' intent. Unfortunately, because we cannot fix ccc values, shaping engines at Microsoft have ignored them. Therefore, the recommendation for passing text to USE is to not normalize.

By the way, at the current time, I do not have a final consensus from Tai Tham experts and community on the changes required to support Tai Tham in USE. Therefore, I've not been able to make the changes proposed in this thread.

Cheers,
Andrew

-----Original Message-----
From: Richard Wordingham <richard.wording...@ntlworld.com>
Sent: 07 August 2019 13:29
To: Richard Wordingham via Unicode <unicode@unicode.org>
Cc: Andrew Glass <andrew.gl...@microsoft.com>
Subject: Re: What is the time frame for USE shapers to provide support for CV+C ?

On Tue, 14 May 2019 03:08:04 +0100 Richard Wordingham via Unicode <unicode@unicode.org> wrote:

On Tue, 14 May 2019 00:58:07 +0000 Andrew Glass via Unicode <unicode@unicode.org> wrote:

Here is the essence of the initial changes needed to support CV+C. Open to feedback.

* Create new SAKOT class
  SAKOT (Sk) based on UISC = Invisible_Stacker
* Reduced HALANT class
  Now only HALANT (H) based on UISC = Virama
* Updated Standard cluster mode
  [< R | CS >] < B | GB > [VS] (CMAbv)* (CMBlw)* (< < H | Sk > B | SUB[VS] (CMAbv)* (CMBlw)*)* [MPre] [MAbv] [MBlw] [MPst] (VPre)* (VAbv)* (VBlw)* (VPst)* (VMPre)* (VMAbv)* (VMBlw)* (VMPst)* (Sk B)* (FAbv)* (FBlw)* (FPst)* [FM]

This next question does not, I believe, affect HarfBuzz. Will NFC code render as well as unnormalised code? In the first example above, <TONE-2, SAKOT, LOW YA> normalises to <SAKOT, TONE-2, LOW YA>, which does not match any portion of the regular expression.

Could someone answer this question, please?
The USE documentation ("CGJ handling will need to be updated if USE is modified to support normalization") still implies that the USE does not respect canonical equivalence. Richard.
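
The reordering Richard describes can be reproduced with a small Python sketch. It is an illustration only, not part of any shaping engine: it assumes the Unicode data bundled with Python's `unicodedata` module, and it looks the three Tai Tham characters up by name rather than hard-coding code points.

```python
import unicodedata

# Look the characters up by their Unicode names (assumes the interpreter's
# bundled Unicode Character Database includes the Tai Tham block).
TONE_2 = unicodedata.lookup("TAI THAM SIGN TONE-2")
SAKOT = unicodedata.lookup("TAI THAM SIGN SAKOT")
LOW_YA = unicodedata.lookup("TAI THAM LETTER LOW YA")


def describe(text):
    """Return (code point, canonical combining class, name) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.combining(ch), unicodedata.name(ch))
        for ch in text
    ]


# The sequence as an author would type it: tone mark, then SAKOT + subjoined letter.
original = TONE_2 + SAKOT + LOW_YA

# NFC applies canonical ordering, which sorts adjacent marks with non-zero ccc.
nfc = unicodedata.normalize("NFC", original)

print("original:")
for row in describe(original):
    print("  ", row)

print("NFC:")
for row in describe(nfc):
    print("  ", row)

print("sequence changed by NFC:", nfc != original)
```

Because SAKOT has a lower non-zero combining class than TONE-2, canonical ordering moves it in front of the tone mark, which is why a shaper that matches clusters against the unnormalised order sees a different sequence after NFC.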