This flexible string representation is wrong by design. Expecting to gain something by dividing "Unicode" into chunks is an illusion. It was created by a computer scientist who thinks only in "bytes", when in this field one has to think about "bytes" and the usage of the characters at the same time. The latin-1 chunk illustrates this wonderfully.
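For anyone who wants to see the "chunks" for themselves, here is a minimal sketch. It assumes CPython 3.3 or later, where the flexible string representation (PEP 393) is in use. The helper name bytes_per_char is made up for this post, and sys.getsizeof reports an implementation detail, so other interpreters may behave differently or not support it at all.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate the storage used per code point for a string made only of `ch`.

    Hypothetical helper: subtracting the sizes of two strings of different
    length cancels the fixed object header, leaving only the per-character cost.
    """
    return (sys.getsizeof(ch * 2 * n) - sys.getsizeof(ch * n)) // n

# One sample character from each range the representation distinguishes.
samples = {
    "ascii   U+0061": "a",
    "latin-1 U+00E9": "\u00e9",
    "BMP     U+20AC": "\u20ac",       # the euro sign, outside latin-1
    "astral  U+1F600": "\U0001F600",
}

for label, ch in samples.items():
    print(label, bytes_per_char(ch), "byte(s) per character")
```

On such a CPython, the latin-1 sample should come out at 1 byte per character and the euro sign at 2, which is the boundary the latin-1 chunk draws.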
jmf