Re: Grapheme clusters, a.k.a. real characters

Marko Rauhamaa Sat, 15 Jul 2017 07:08:24 -0700

Steve D'Aprano <steve+pyt...@pearwood.info>:

> On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote:
>> I might want random access to the "Grapheme clusters, a.k.a. real
>> characters".
>
> That would be nice to have, but the truth is that for most coders,
> Unicode code points are the low-hanging fruit that get you 95% of the
> way, and for many applications that's "close enough".


I think "close enough" is actually dangerous. We shouldn't encourage
that practice.
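
As a small illustration of where "close enough" goes wrong, here is a
minimal sketch using only the standard library. The sample word and the
variable names are made up for the example; the point is that len(),
slicing and reversal all operate on code points, so a letter and its
combining accent can be pulled apart.

```python
import unicodedata

# "caf\u00e9" in decomposed form (NFD): the final letter becomes two code
# points, 'e' followed by U+0301 COMBINING ACUTE ACCENT.
word = unicodedata.normalize("NFD", "caf\u00e9")

code_points = len(word)      # counts code points, not what the reader sees
truncated = word[:4]         # cuts between 'e' and its accent
reversed_word = word[::-1]   # the accent now comes first, detached from 'e'

print(code_points)                 # 5, although a reader sees 4 characters
print(truncated == "cafe")         # True: the accent was silently dropped
print(reversed_word[0] == "\u0301")  # True: the string starts with a bare accent
```

None of these operations raises an error, which is what makes the
practice risky: the damage only shows up when the text is displayed.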

> Support for the Unicode grapheme breaking algorithm would get you
> probably 90% of the rest of the way. And then some sort of
> configurable system where defaults were based on the locale would
> probably get you a fairly complete grapheme-based text library.

Yes, that kind of a text class would be useful.
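
A rough sketch of what such a class might look like follows. This is not
the Unicode grapheme breaking algorithm (UAX #29): as a stated
simplification, it only attaches code points with a nonzero canonical
combining class to the preceding base character, so Hangul syllables,
emoji ZWJ sequences, regional indicators and locale tailoring are all
ignored. The names simple_clusters and ClusterText are hypothetical, not
an existing API.

```python
import unicodedata


def simple_clusters(text):
    """Group each base code point with the combining marks that follow it.

    Simplification: real grapheme segmentation (UAX #29) has many more rules.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch      # attach the mark to the previous cluster
        else:
            clusters.append(ch)     # start a new cluster
    return clusters


class ClusterText:
    """Hypothetical text class indexed by cluster instead of code point."""

    def __init__(self, text):
        self._clusters = simple_clusters(text)

    def __len__(self):
        return len(self._clusters)

    def __getitem__(self, index):
        picked = self._clusters[index]
        # A slice yields a list of clusters; join it back into a str.
        return "".join(picked) if isinstance(index, slice) else picked

    def __str__(self):
        return "".join(self._clusters)


text = ClusterText("cafe\u0301s")   # 6 code points, 5 visible characters
print(len(text))                    # 5
print(text[3] == "e\u0301")         # True: letter and accent stay together
print(text[:4] == "cafe\u0301")     # True: slicing no longer drops the accent
```

Splitting the string once up front costs O(n) time and extra memory, but
it is what makes random access to the clusters possible afterwards.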

> I'm interested in such a thing. That's why I pointed out the issue on
> the bug tracker, to try to garner interest in it. As far as I can
> tell, you seem to be more interested in cheap point scoring, digs
> against Unicode, and an insistence that UTF-8 is better than strings
> (which doesn't even make sense).

It does seem to me UTF-8 is a better waiting position than strings.
Strings give you more trouble while not truly solving any problems.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list
