On 30.08.12 09:55, Steven D'Aprano wrote:
> And Python's solution uses those: UCS-2, UCS-4, and UTF-8.
I see that this misconception is widespread. In fact, Python 3.3 uses four kinds of ready strings.
* ASCII. All codes <= U+007F.
* UCS1. All codes <= U+00FF, at least one code > U+007F.
* UCS2. All codes <= U+FFFF, at least one code > U+00FF.
* UCS4. All codes <= U+0010FFFF, at least one code > U+FFFF.

Indexing is O(1) for any string.

Also, a string can optionally cache its UTF-8 and wchar_t* representations.
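
The per-character width of each kind can be observed from Python itself. What follows is only a rough sketch, assuming CPython 3.3 or later; sys.getsizeof() reports implementation-specific numbers, and bytes_per_char is a name made up for this illustration, not part of any API. It compares two freshly built strings of different lengths so that the fixed object header cancels out.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate storage per code point for a string made only of `ch`.

    Hypothetical helper, CPython-specific: the fixed header is the same
    for both strings, so the size difference is n times the char width.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

samples = [
    ("ASCII", "a"),           # U+0061
    ("UCS1", "\u00e9"),       # U+00E9
    ("UCS2", "\u20ac"),       # U+20AC
    ("UCS4", "\U0001F600"),   # U+1F600
]

for kind, ch in samples:
    print(kind, bytes_per_char(ch))
```

On CPython 3.3+ this should report 1, 1, 2 and 4 bytes per character. ASCII and UCS1 strings share the same width and differ only in the size of the object header, which this measurement deliberately cancels out.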