FWIW, combining marks were not actually added to support emojis.  Emojis
are just one of the more popular uses of the feature.  Combining marks are a
standard Unicode feature needed to represent single “characters” in some
complex situations (e.g. when it is necessary to distinguish between tréma
and umlaut, or to represent certain characters in Navajo).

That being said, I agree with the conclusions.  It’s OK to leave this out
for now, and there is no need to link to any docs.
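
For anyone following along, here is a minimal Python sketch of the behaviour
discussed in the quoted messages below. It is not the kernel's implementation;
reverse_codepoints is a hypothetical helper that simply reverses a Python str,
which is already a sequence of codepoints. It shows that codepoint-wise
reversal swaps the two regional indicators of a flag and detaches a combining
accent from its base letter.

```python
import unicodedata


def reverse_codepoints(s: str) -> str:
    """Hypothetical helper: reverse a string one codepoint at a time."""
    return s[::-1]


# US flag: REGIONAL INDICATOR SYMBOL LETTER U followed by LETTER S.
flag_us = "\U0001F1FA\U0001F1F8"
reversed_flag = reverse_codepoints(flag_us)
# The indicators are now in the order S, U, so this is no longer the US flag.

# "café" written with a base 'e' plus U+0301 COMBINING ACUTE ACCENT.
word = "cafe\u0301"
reversed_word = reverse_codepoints(word)
# The combining accent is now the first codepoint, with no base letter
# before it to attach to.

# NFC normalization composes 'e' + U+0301 into the single codepoint U+00E9,
# so this particular word survives reversal. It does not help with flags.
composed = unicodedata.normalize("NFC", word)
reversed_composed = reverse_codepoints(composed)
```

Normalizing first only helps where a precomposed codepoint exists. Reversing
by grapheme cluster in general needs the segmentation rules from the Unicode
standard, which the Python standard library does not provide.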

On Mon, May 17, 2021 at 5:31 AM Antoine Pitrou <anto...@python.org> wrote:

>
> I'm fine with pointing out that the function operates on codepoints.
>
> Linking to the Unicode documentation for emojis sounds entirely like a
> distraction, though.
>
> Regards
>
> Antoine.
>
>
> On 17/05/2021 at 17:28, Ian Cook wrote:
> > +1 for clarifying this in the kernel documentation, referring to these
> > multi-emoji glyphs as "emoji ZWJ sequences," and linking to
> > https://unicode.org/emoji/charts/emoji-zwj-sequences.html
> >
> > Ian
> >
> >
> > On Mon, May 17, 2021 at 11:21 AM Antoine Pitrou <anto...@python.org> wrote:
> >>
> >>
> >> On 17/05/2021 at 17:17, David Li wrote:
> >>> A little clarification on my point: it's not that a single codepoint
> >>> gets encoded with more than four bytes, it's that a grapheme
> >>> cluster/human-delimited 'character' might be multiple codepoints, so
> >>> reversing the individual codepoints may produce an unexpected
> >>> result. For instance a flag emoji is actually two codepoints (two
> >>> special 'letter' codepoints that represent the country code), so
> >>> reversing a US flag naively will give you an odd '[SU]' instead.
> >>
> >> This sounds like saying that reversing a valid French word does not
> >> produce a valid French word (well, in most cases). The kernel
> >> documentation can't contain an entire tutorial about Unicode characters
> >> and what to expect from them, IMHO.
> >>
> >> Regards
> >>
> >> Antoine.
>
