On Mon, Aug 27, 2012 at 1:16 PM, <wxjmfa...@gmail.com> wrote:
> - Why int32 and not uint32? No idea, I tried to find an
> answer without asking.
UCS-4 is technically only a 31-bit encoding. The sign bit is never used, so the choice of int32 vs. uint32 is inconsequential.

(In fact, since Unicode has been limited to the range 0 to 0x0010FFFF, a code point needs only 21 bits, so the *entire high-order byte* as well as 3 bits of the next byte are always zero. Truly, UTF-32 is not designed for memory efficiency.)
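For anyone who wants to check the arithmetic, here is a rough sketch using only the standard library. The variable names are mine, purely for illustration. It counts the bits needed for the largest code point and shows that packing it as a signed or an unsigned 32-bit integer yields the same bytes.

```python
import struct

# Largest code point allowed by the Unicode range quoted above.
MAX_CODE_POINT = 0x10FFFF

# Bits needed to hold the largest code point, and bits left over in 32.
bits_needed = MAX_CODE_POINT.bit_length()
unused_bits = 32 - bits_needed

# Pack the same value as signed ("i") and unsigned ("I") little-endian
# 32-bit integers. The sign bit is never set, so the bytes are identical.
as_int32 = struct.pack("<i", MAX_CODE_POINT)
as_uint32 = struct.pack("<I", MAX_CODE_POINT)

# The UTF-32 (little-endian, no BOM) encoding of that code point.
as_utf32 = chr(MAX_CODE_POINT).encode("utf-32-le")

print(bits_needed, unused_bits)
print(as_int32 == as_uint32 == as_utf32)
```

The 11 unused bits are the 8 of the high-order byte plus the top 3 of the next one.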