On Fri, Jun 21, 2013 at 2:27 AM, <wxjmfa...@gmail.com> wrote: > And all these coding schemes have something in common, > they work all with a unique set of code points, more > precisely a unique set of encoded code points (not > the set of implemented code points (byte)). > > Just what the flexible string representation is not > doing, it artificially devides unicode in subsets and try > to handle eache subset differently. >
UTF-16 divides Unicode into two subsets: BMP characters (encoded using one 16-bit unit) and astral characters (encoded using two 16-bit units in the D800::/5 netblock, or equivalent thereof). Your beloved narrow builds are guilty of exactly the same crime as the hated 3.3. ChrisA -- http://mail.python.org/mailman/listinfo/python-list