On Thu, Jul 11, 2013 at 11:18 PM, <wxjmfa...@gmail.com> wrote:
> Just to stick with this funny character ẞ, a ucs-2 char
> in the Flexible String Representation nomenclature.
>
> It seems to me that, when one needs more than ten bytes
> to encode it,
>
>>>> sys.getsizeof('a')
> 26
>>>> sys.getsizeof('ẞ')
> 40
>
> this is far away from the perfection.

A better comparison is to see how much space is used by one copy of it,
and how much by two copies:

>>> sys.getsizeof('aa')-sys.getsizeof('a')
1
>>> sys.getsizeof('ẞẞ')-sys.getsizeof('ẞ')
2

String objects have overhead. Big deal.

> BTW, for a modern language, is not ucs2 considered
> as obsolete since many, many years?

Clearly. And similarly, the 16-bit integer has been completely
obsoleted, as there is no reason anyone should ever bother to use it.
Same with the float type - everyone uses double or better these days,
right?

http://www.postgresql.org/docs/current/static/datatype-numeric.html
http://www.cplusplus.com/doc/tutorial/variables/

Nope, nobody uses small integers any more, they're clearly completely
obsolete.

ChrisA
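
For anyone who wants to repeat the one-copy-versus-two-copies comparison for more than two characters, here is a rough sketch. It assumes CPython 3.3 or later (the Flexible String Representation); other implementations may report different numbers or not support sys.getsizeof at all. The helper name per_char_cost and the sample characters are made up for illustration.

```python
import sys

def per_char_cost(ch, n=100):
    """Bytes added by one more copy of ch, with the fixed object overhead cancelled out.

    Hypothetical helper: compares two freshly built strings of n+1 and n copies.
    """
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)

# Sample characters chosen for illustration: ASCII, Latin-1, BMP, and astral.
samples = ['a', '\xe9', '\u1e9e', '\U0001f600']

for ch in samples:
    # Fixed overhead is whatever remains after subtracting one character's cost.
    overhead = sys.getsizeof(ch) - per_char_cost(ch)
    print('U+%04X: %d byte(s) per char, about %d bytes of fixed overhead'
          % (ord(ch), per_char_cost(ch), overhead))
```

The overhead figure depends on the build (32-bit vs 64-bit) and the Python version, which is why it is printed rather than hard-coded. The per-character figure is the part that says anything about the representation.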