On 2013-12-02, Ethan Furman <et...@stoneleaf.us> wrote:
> On 11/29/2013 04:44 PM, Steven D'Aprano wrote:
>> Out of the nine tests, Python 3.3 passes six, with three tests
>> being failures or dubious. If you believe that the native
>> string type should operate on code-points, then you'll think
>> that Python does the right thing.
>
> I think Python is doing it correctly. If I want to operate on
> "clusters" I'll normalize the string first.

Normalizing doesn't resolve the issues the blog brings up; NFC
can't condense every multi-code-point sequence into one, and
normalizing can lose or mangle information. There are good
examples here: http://unicode.org/reports/tr15/

> Thanks for this excellent post.

Agreed.

--
Neil Cerutti
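
For anyone who wants to check the two claims above, here is a minimal
sketch, assuming Python 3 and the standard unicodedata module. The
sample strings are illustrative picks of mine, not taken from the blog
post.

```python
import unicodedata

# "e" + COMBINING ACUTE ACCENT has a precomposed form (U+00E9),
# so NFC condenses it to a single code point.
composable = "e\u0301"
nfc_composable = unicodedata.normalize("NFC", composable)
print(len(nfc_composable))        # 1

# "q" + COMBINING DOT BELOW has no precomposed form in Unicode,
# so NFC leaves it as two code points even though it is one "cluster".
not_composable = "q\u0323"
nfc_not_composable = unicodedata.normalize("NFC", not_composable)
print(len(nfc_not_composable))    # 2

# Normalizing can also discard distinctions present in the input.
# Under NFC, ANGSTROM SIGN (U+212B) is replaced by
# LATIN CAPITAL LETTER A WITH RING ABOVE (U+00C5).
angstrom = "\u212b"
nfc_angstrom = unicodedata.normalize("NFC", angstrom)
print(nfc_angstrom == "\u00c5")   # True

# Under NFKC, SUPERSCRIPT TWO (U+00B2) becomes a plain "2",
# so "x squared" and "x2" can no longer be told apart.
squared = "x\u00b2"
nfkc_squared = unicodedata.normalize("NFKC", squared)
print(nfkc_squared)               # x2
```

So len() on a normalized str still counts code points, not clusters,
and the normalized text is not always recoverable back to the original.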