Ian Foote:

Specifically, indexing a variable-length encoding like utf-8 is not as
efficient as indexing a fixed-length encoding.

Many common string operations do not require indexing by character, which reduces the impact of this inefficiency. UTF-8 seems like a reasonable choice for an internal representation to me. One benefit of UTF-8 over Python's flexible representation is that it is, on average, more compact over a wide set of samples.
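
To make both points concrete, here is a rough sketch in Python. The names utf8_index and flexible_payload_size are made up for illustration and are not part of any library. The first function shows why indexing UTF-8 by character means scanning from the start of the buffer. The second estimates the character payload of the flexible representation, assuming the 1, 2 or 4 bytes per character rule from PEP 393 and ignoring the object header.

```python
def utf8_index(data: bytes, n: int) -> str:
    """Return the n-th code point (0-based) of UTF-8 data by linear scan."""
    if n < 0:
        raise IndexError("negative index not supported in this sketch")
    count = -1
    start = None
    for pos, byte in enumerate(data):
        # Continuation bytes look like 0b10xxxxxx; any other byte starts
        # a new code point.
        if byte & 0xC0 != 0x80:
            count += 1
            if count == n:
                start = pos
            elif count == n + 1:
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError("index out of range")
    # The requested code point was the last one in the buffer.
    return data[start:].decode("utf-8")


def flexible_payload_size(s: str) -> int:
    """Estimate the character payload of a str, in bytes (header ignored).

    Assumes the widest character sets the width for the whole string:
    1 byte up to U+00FF, 2 bytes up to U+FFFF, 4 bytes otherwise.
    """
    if not s:
        return 0
    highest = max(map(ord, s))
    width = 1 if highest <= 0xFF else 2 if highest <= 0xFFFF else 4
    return width * len(s)


text = "a" * 99 + "\U0001D11E"  # mostly ASCII plus one non-BMP character
encoded = text.encode("utf-8")
print(utf8_index(encoded, 99) == text[99])
print(len(encoded), flexible_payload_size(text))
```

In this sample a single character outside the BMP widens every character in the flexible representation, while UTF-8 pays the extra bytes only for that one character. Other inputs can come out differently, so treat the numbers as an illustration, not a benchmark.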

   Neil
--
http://mail.python.org/mailman/listinfo/python-list
