Chris Angelico <ros...@gmail.com> writes:
> So, I don't actually have any stats for you, because it's really easy
> to just not index strings at all.

Right, that's why I think the O(n) indexing issue of UTF-8 may be
overblown.  Haskell 98 was mentioned earlier as a language that did
Unicode "correctly", but its strings are linked lists of code points.
They are a performance pig, to be sure, but the O(n) indexing is
usually not the bottleneck.  These days there is a "Text" module
(Data.Text) that I think is basically an array of UTF-16 code units.
I have been meaning to find out what happens with non-BMP characters,
which need two code units (a surrogate pair) in UTF-16.
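
For what it's worth, here is a rough Python 3 sketch of both points:
finding the n-th code point in UTF-8 data means scanning from the
start, and a non-BMP character takes two 16-bit units in UTF-16.  The
names utf8_index and utf16_units are made up for illustration; this is
not how any particular implementation does it.

```python
def utf8_index(data: bytes, n: int) -> str:
    """Return the n-th code point of UTF-8 data by scanning from the start (O(n))."""
    if n < 0:
        raise IndexError("negative index not supported in this sketch")
    count = -1      # index of the code point currently being scanned
    start = None    # byte offset where the n-th code point begins
    for i, byte in enumerate(data):
        # Bytes of the form 10xxxxxx are continuation bytes;
        # any other byte starts a new code point.
        if byte & 0xC0 != 0x80:
            count += 1
            if count == n:
                start = i
            elif count == n + 1:
                return data[start:i].decode("utf-8")
    if start is None:
        raise IndexError("code point index out of range")
    # The n-th code point was the last one in the data.
    return data[start:].decode("utf-8")


def utf16_units(s: str) -> int:
    """Number of 16-bit code units needed to store s as UTF-16."""
    return len(s.encode("utf-16-le")) // 2


# 'a', e-acute, U+1F600 (outside the BMP), 'z'
text = "a\u00e9\U0001F600z"
data = text.encode("utf-8")

print(len(text))            # code points
print(len(data))            # UTF-8 bytes
print(utf16_units(text))    # UTF-16 code units
print(utf8_index(data, 2) == "\U0001F600")
```

The scan is linear in the byte length, but code that walks a string
from front to back never needs it, which is the usual case.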
