On Sat, Feb 21, 2009 at 9:10 PM, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
>>> I'm pretty much sure it is UCS-2 or UCS-4. (Yes, I know there is only a
>>> slight difference to UTF-16/UTF-32).
>>
>> I wouldn't call the difference that slight, especially between UTF-16
>> and UCS-2, since the former can encode all Unicode code points, while
>> the latter can only encode those in the BMP.
>
> Indeed. As Python *can* encode all characters even in 2-byte mode
> (since PEP 261), it seems clear that Python's Unicode representation
> is *not* strictly UCS-2 anymore.
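
For anyone following along, the UTF-16 versus UCS-2 distinction quoted above can be sketched in a few lines of Python. This is only an illustration of the encoding arithmetic, not a description of how CPython stores strings internally. The helper name to_surrogates is made up for the example, and the snippet assumes a Python 3 interpreter.

```python
# Illustration only; to_surrogates is a hypothetical helper, not a stdlib function.

def to_surrogates(cp):
    """Split a code point above U+FFFF into a (high, low) UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def utf16_units(s):
    """Return the 16-bit code units of s when encoded as big-endian UTF-16."""
    raw = s.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


# U+20AC (EURO SIGN) is in the BMP: one 16-bit unit, so UCS-2 can hold it too.
bmp_units = utf16_units("\u20ac")

# U+1D11E (MUSICAL SYMBOL G CLEF) is outside the BMP: UTF-16 needs two units,
# which is exactly what plain UCS-2 has no way to express.
clef_units = utf16_units("\U0001D11E")

print([hex(u) for u in bmp_units])
print([hex(u) for u in clef_units])
print([hex(u) for u in to_surrogates(0x1D11E)])
```

The second and third print calls should show the same pair of code units, since the codec performs the same split that the helper spells out by hand.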

Since we're already discussing this, I'm curious - why was UCS-2 chosen
over plain UTF-16 or UTF-8 in the first place for Python's internal
storage?

--
Denis Kasak