On Wed, Jul 31, 2013 at 6:45 AM, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:
> if you care about minimizing every possible byte, you should
> use a low-level language like C. Then you can give every character 21
> bits, and be happy that you don't waste even one bit.

Could go better! Since not every character has been assigned, and some
are specifically banned (eg U+FFFE and U+D800-U+DFFF), you could cut them
out of your representation system and save memory!

ChrisA
--
http://mail.python.org/mailman/listinfo/python-list
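
For anyone who wants to check the arithmetic behind the 21-bit figure and the "banned" code points, here is a rough Python sketch. It is not from the original posts: the helper names (`bits_needed`, `is_surrogate`, `is_noncharacter`) are made up for illustration, and it only excludes surrogates and noncharacters, not unassigned code points, since the set of assigned characters changes with each Unicode version.

```python
# Rough sketch: how many bits does a fixed-width code point index need?
# All names here are illustrative, not taken from the thread.

MAX_CODE_POINT = 0x10FFFF
TOTAL_CODE_POINTS = MAX_CODE_POINT + 1  # U+0000 through U+10FFFF


def bits_needed(n):
    """Bits required for a fixed-width field that distinguishes n values."""
    return max(1, (n - 1).bit_length())


def is_surrogate(cp):
    """True for U+D800..U+DFFF, which are reserved for UTF-16."""
    return 0xD800 <= cp <= 0xDFFF


def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and the last two code points of each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


# Count what is left after dropping surrogates and noncharacters.
usable = sum(
    1
    for cp in range(TOTAL_CODE_POINTS)
    if not is_surrogate(cp) and not is_noncharacter(cp)
)

print("all code points:", TOTAL_CODE_POINTS, "->", bits_needed(TOTAL_CODE_POINTS), "bits")
print("minus banned:   ", usable, "->", bits_needed(usable), "bits")
```

Under these assumptions the trimmed set is still larger than 2**20, so a fixed-width field needs the same number of bits either way; any real saving would have to come from also dropping unassigned code points or from a variable-width packing.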