Mutable, immutable, copying + xxx, buffering, O(n) ... Yes, but conceptually the re-encoding happens sometime, somewhere. The internal "ucs-2" will never automagically be transformed into "ucs-4" (e.g.).
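
A rough way to observe the re-encoding mentioned above from the interpreter. This is only a sketch and it assumes CPython 3.3 or later (the PEP 393 flexible string representation); other implementations or older builds will report different numbers. The helper name bytes_per_char is made up for illustration.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate storage per character by differencing two string sizes.

    Subtracting the sizes of a 2n-character and an n-character string
    cancels the fixed object header, leaving n * (bytes per character).
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

# Width is chosen by the widest code point in the string (CPython 3.3+ assumed).
widths = {ch: bytes_per_char(ch) for ch in ("a", "\xe9", "\u20ac", "\U00010100")}
for ch, width in widths.items():
    print("U+%06X -> %d byte(s) per character" % (ord(ch), width))

# Appending one wider character changes the storage of the whole result.
narrow = "a" * 10000 + "z"
wide = "a" * 10000 + "\u20ac"
print(sys.getsizeof(narrow), sys.getsizeof(wide))
```

On such a build the two sizes on the last line should differ by roughly a factor of two: the 10000 one-byte characters end up in a two-byte layout as soon as a single '€' is appended.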

>>> timeit.timeit("'a'*10000 +'€'")
7.087220684719967
>>> timeit.timeit("'a'*10000 +'z'")
1.5685214234430873
>>> timeit.timeit("z = 'a'*10000; z = z +'€'")
7.169538866162213
>>> timeit.timeit("z = 'a'*10000; z = z +'z'")
1.5815893830557286
>>> timeit.timeit("z = 'a'*10000; z += 'z'")
1.606955741596181
>>> timeit.timeit("z = 'a'*10000; z += '€'")
7.160483334521416

And do not forget, in a pure utf coding scheme, your char or a char will *never* be larger than 4 bytes.

>>> sys.getsizeof('a')
26
>>> sys.getsizeof('\U000101000')
48

jmf

--
http://mail.python.org/mailman/listinfo/python-list