On 30-07-13 21:09, wxjmfa...@gmail.com wrote:

> Matable, immutable, copyint + xxx, bufferint, O(n) ....
> Yes, but conceptualy the reencoding happen sometime, somewhere.

Which is a far cry from your previous claim that it happened
every time you enter a char. This of course makes your case
harder to argue, because the impact of something that happens
sometime, somewhere is vastly less than that of something that
happens every time you enter a char.

> The internal "ucs-2" will never automagically be transformed
> into "ucs-4" (eg).

It will just start producing wrong results when someone starts
using characters that don't fit into ucs-2.

>>>> timeit.timeit("'a'*10000 +'€'")
> 7.087220684719967
>>>> timeit.timeit("'a'*10000 +'z'")
> 1.5685214234430873
>>>> timeit.timeit("z = 'a'*10000; z = z +'€'")
> 7.169538866162213
>>>> timeit.timeit("z = 'a'*10000; z = z +'z'")
> 1.5815893830557286
>>>> timeit.timeit("z = 'a'*10000; z += 'z'")
> 1.606955741596181
>>>> timeit.timeit("z = 'a'*10000; z += '€'")
> 7.160483334521416
>
>
> And do not forget, in a pure utf coding scheme, your
> char or a char will *never* be larger than 4 bytes.
>
>>>> sys.getsizeof('a')
> 26
>>>> sys.getsizeof('\U000101000')
> 48

Nonsense. sys.getsizeof reports the size of the whole object,
header included, not the number of bytes a character takes in
an encoding. A "pure utf" byte string carries that overhead too:

>>> sys.getsizeof('a'.encode('utf-8'))
18
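
To separate the fixed per-object overhead from the per-character cost, one can difference the sizes of two strings of different lengths. The following is only a minimal sketch, assuming CPython 3.3 or later (the flexible string representation); the helper names per_char_bytes and utf8_len are made up for illustration, and the in-memory figures are implementation details that may differ on other interpreters. It uses '\U00010100' (exactly eight hex digits after \U) as the non-BMP sample.

```python
import sys


def per_char_bytes(ch, n=1000):
    """Estimate in-memory bytes per character for a string made of `ch`.

    Differencing two sizes cancels the fixed object header, which is
    what dominates sys.getsizeof() for one-character strings.
    (Hypothetical helper; relies on CPython implementation details.)
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n


def utf8_len(ch):
    """Number of bytes `ch` occupies once encoded as UTF-8."""
    return len(ch.encode('utf-8'))


for ch in ('a', '€', '\U00010100'):
    print(repr(ch),
          'utf-8 bytes:', utf8_len(ch),
          'in-memory bytes per char:', per_char_bytes(ch))
```

Comparing sys.getsizeof('a') with a 4-byte upper bound mixes the two quantities: the bound applies to the encoded payload, while getsizeof also counts the header of the str or bytes object.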