On Tue, 30 Jul 2013 12:09:11 -0700, wxjmfauth wrote:

> And do not forget, in a pure utf coding scheme, your char or a char will
> *never* be larger than 4 bytes.
>
>>>> sys.getsizeof('a')
> 26
>>>> sys.getsizeof('\U000101000')
> 48

Neither character above is larger than 4 bytes. You forgot to deduct the size of the object header.

Python is a high-level object-oriented language. If you care about minimizing every possible byte, you should use a low-level language like C. Then you can give every character 21 bits, and be happy that you don't waste even one bit.

-- 
Steven
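
For anyone who wants to check the header point themselves, here is a rough sketch. It assumes CPython 3.3 or later, where sys.getsizeof() reports the whole string object, and the helper name bytes_per_char is made up for this example. Subtracting the sizes of two strings that differ by one character cancels the fixed header and leaves only the per-character storage.

```python
import sys

def bytes_per_char(ch, n=10):
    """Estimate per-character storage for strings made of `ch`.

    Hypothetical helper: the fixed object header is the same for both
    strings, so it cancels out in the subtraction.
    """
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)

# An ASCII letter, a BMP character, and a character outside the BMP.
for ch in ('a', '\u20ac', '\U00010100'):
    # Total size of a one-character string versus the marginal cost of a character.
    print(ascii(ch), sys.getsizeof(ch), bytes_per_char(ch))
```

The exact totals depend on the Python version and on whether the build is 32-bit or 64-bit, which is why no expected output is shown. The marginal cost per character should not exceed 4 bytes.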