# New Ticket Created by raiph
# Please include the string: [perl #125927]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/Ticket/Display.html?id=125927 >

jnthn++ and others are busy with work that is far more important and urgent than dealing with this right now. I'm filing this bug now because there are reasons to consider addressing it before Christmas, as explained below.

What I did
==========

say "नि".chars

What I expected
===============

1

What I got
==========

2

-----------------

Some reasons why I think it's appropriate to classify नि as a single grapheme:

1. It's the last of 4 sample single graphemes in the "Extended Grapheme Clusters" section of the Unicode Standard Annex #29 on Text Segmentation:
   http://www.unicode.org/reports/tr29/tr29-27.html#Table_Sample_Grapheme_Clusters
   (The Unicode standard suggests aiming at Extended Grapheme Clusters at a minimum if an implementation wishes to deal with grapheme clusters.)

2. It's the first example in S15:
   https://raw.githubusercontent.com/perl6/specs/master/S15-unicode.pod

3. It behaves as a single unit for selection in my browser. (You too?)

--------

The bug I'm reporting in this RT was discussed briefly today on IRC:

jnthn    m: say "नि".NFC.list.say
camelia  OUTPUT«2344 2367True»
jnthn    m: say uniprop(2367, 'Canonical_Combining_Class')
camelia  OUTPUT«0»
jnthn    ... combiners are identified in the NFG algo by having a CCC > 0

----------

So, presumably, to match Unicode's default extended grapheme cluster definition, the CCC > 0 condition is insufficient for identifying combiners, including one that's part of a sample grapheme that the Unicode standard saw fit to highlight.

It's this latter point -- that it's become a go-to example on the net -- that's one of the main reasons I'm filing this bug; I don't otherwise use Devanagari!
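
As a cross-check that does not depend on Rakudo, the gap described above can be sketched in Python with the standard `unicodedata` module. This is only an illustration under stated assumptions: the helper names are hypothetical, and the second helper is a rough stand-in for UAX #29. It attaches every code point whose general category is Mn, Mc, or Me to the preceding character, instead of consulting the real Grapheme_Cluster_Break property, which Python's standard library does not expose.

```python
import unicodedata

# U+0928 DEVANAGARI LETTER NA followed by U+093F DEVANAGARI VOWEL SIGN I
# (decimal 2344 and 2367, the two code points shown in the IRC log).
SAMPLE = "\u0928\u093f"


def ccc_only_clusters(text):
    """Group code points, treating only CCC > 0 as combining.

    This mirrors the rule quoted from IRC; it is not the actual NFG code.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) > 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def mark_aware_clusters(text):
    """Group code points, also attaching any mark (Mn, Mc, Me).

    ASSUMPTION: general category is used as a rough approximation of the
    Extend and SpacingMark classes in UAX #29; it is not the full algorithm.
    """
    clusters = []
    for ch in text:
        is_mark = unicodedata.category(ch) in ("Mn", "Mc", "Me")
        if clusters and (is_mark or unicodedata.combining(ch) > 0):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


if __name__ == "__main__":
    sign = SAMPLE[1]
    print("CCC of U+093F:", unicodedata.combining(sign))
    print("General category of U+093F:", unicodedata.category(sign))
    print("CCC-only cluster count:", len(ccc_only_clusters(SAMPLE)))
    print("Mark-aware cluster count:", len(mark_aware_clusters(SAMPLE)))
```

U+093F has a Canonical_Combining_Class of 0 but is a spacing combining mark (general category Mc), so the CCC-only grouping yields two clusters for नि while the mark-aware grouping yields one.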