On Fri, May 2, 2014 at 2:42 PM, Rustom Mody <rustompm...@gmail.com> wrote:
> Unicode consortium's going from old BMP to current (6.0) SMPs to
> who-knows-what in the future is similar.

Unicode 1.0: "Let's make a single universal character set that can represent all the world's scripts. We'll define 65536 codepoints to do that with."

Unicode 2.0: "Oh. That's not enough. Okay, let's define some more."

It's not a fundamental change, nor is it unhelpful to Unicode's cause. It's simply an acknowledgement that 64K codepoints aren't enough. Yes, that gave us the mess of UTF-16 being called "Unicode" (if it hadn't been for Unicode 1.0, I doubt we'd now have so many languages using and exposing UTF-16 - it'd be a simple judgment call, pick UTF-8/UTF-16/UTF-32 based on what you expect your users to want to use), but it doesn't change Unicode's goal, and it also doesn't indicate that there are likely to be any more such changes in the future. (Just look at how little of the Unicode space is allocated so far.)

ChrisA
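
For anyone who wants to see the 64K limit in practice, here is a minimal sketch, assuming Python 3. The helper name code_units and the two sample characters are my own choices for illustration, not anything from the thread. It counts how many code units one BMP character and one non-BMP character need in each encoding, and prints the size of the full code space (17 planes of 65536 code points).

```python
# Illustrative sketch only; code_units is a hypothetical helper name.
def code_units(ch):
    """Return (UTF-8 bytes, UTF-16 code units, UTF-32 code units) for one character."""
    utf8 = len(ch.encode("utf-8"))
    # The "-le" variants are used so that no byte order mark is prepended.
    utf16 = len(ch.encode("utf-16-le")) // 2
    utf32 = len(ch.encode("utf-32-le")) // 4
    return utf8, utf16, utf32

bmp = "\u20ac"          # EURO SIGN, inside the original 65536-codepoint range
astral = "\U0001F600"   # GRINNING FACE, outside the BMP

print(code_units(bmp))     # (3, 1, 1)
print(code_units(astral))  # (4, 2, 1): UTF-16 needs a surrogate pair

# Total code space: 17 planes of 65536 code points each.
TOTAL_CODE_POINTS = 17 * 0x10000
print(TOTAL_CODE_POINTS)   # 1114112
```

The second result is the point: anything past U+FFFF takes two UTF-16 code units, which is why code that treats UTF-16 as "one unit per character" goes wrong on such text.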