Re: Python Unicode handling wins again -- mostly

Neil Cerutti Tue, 03 Dec 2013 05:51:50 -0800

On 2013-12-02, Ethan Furman <et...@stoneleaf.us> wrote:
> On 11/29/2013 04:44 PM, Steven D'Aprano wrote:
>> Out of the nine tests, Python 3.3 passes six, with three tests
>> being failures or dubious. If you believe that the native
>> string type should operate on code-points, then you'll think
>> that Python does the right thing.
>
> I think Python is doing it correctly.  If I want to operate on
> "clusters" I'll normalize the string first.


Normalizing doesn't resolve the issues the blog brings up: NFC
can't condense every multi-code-point sequence into a single code
point, and the compatibility forms (NFKC and NFKD) can lose or
mangle information. There are good examples in UAX #15:
http://unicode.org/reports/tr15/
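
A minimal sketch of both points, assuming Python 3.3 or later and
the standard unicodedata module. The sample strings are illustrative
choices of mine, not taken from the blog or from the report:

```python
import unicodedata

# "e" + COMBINING ACUTE ACCENT has a precomposed form (U+00E9),
# so NFC folds the pair into a single code point.
composable = "e\u0301"
nfc_composable = unicodedata.normalize("NFC", composable)

# "q" + COMBINING DOT BELOW has no precomposed form, so NFC leaves
# two code points even though a reader sees one "character".
uncomposable = "q\u0323"
nfc_uncomposable = unicodedata.normalize("NFC", uncomposable)

print(len(nfc_composable))    # 1
print(len(nfc_uncomposable))  # 2

# Compatibility normalization discards distinctions: SUPERSCRIPT TWO
# (U+00B2) becomes a plain digit, so "x squared" turns into "x2".
superscript = "x\u00b2"
nfkc = unicodedata.normalize("NFKC", superscript)
print(nfkc == "x2")           # True
```

So len() on a normalized string still counts code points, not
clusters, and the compatibility forms are not round-trippable.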

> Thanks for this excellent post.

Agreed.

-- 
Neil Cerutti

