On Sun, Jan 5, 2014 at 1:41 PM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:
> wxjmfa...@gmail.com wrote:
>
>> The very interesting aspect in the way you are holding
>> unicodes (strings). By comparing Python 2 with Python 3.3,
>> you are comparing utf-8 with the internal "representation"
>> of Python 3.3 (the flexible string representation).
>
> This is incorrect. Python 2 has never used UTF-8 internally for Unicode
> strings. In narrow builds, it uses UTF-16, but makes no allowance for
> surrogate pairs in strings. In wide builds, it uses UTF-32.

That applies to Python 2's unicode type. What Robin said was that they
were using either a byte string ("str") with UTF-8 data, or a Unicode
string ("unicode") with character data. So jmf was right, except that
it's not specifically to do with Py2 vs Py3.3.

ChrisA
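
To make the distinction concrete, here is a minimal sketch of the two ways of holding the same text. It is written for Python 3, where bytes plays roughly the role of Python 2's "str" and str plays the role of "unicode". The sample string and variable names are made up for illustration and are not taken from Robin's code.

```python
# Python 3 sketch: bytes stands in for Python 2's "str",
# and str stands in for Python 2's "unicode".
text = "na\u00efve \u20ac5"        # hypothetical sample text: 8 code points
utf8_data = text.encode("utf-8")   # byte string holding UTF-8 data

# Length means different things: code points versus bytes.
char_count = len(text)
byte_count = len(utf8_data)

# Indexing differs too: a character versus a single byte value.
third_char = text[2]        # the i-with-diaeresis character
third_byte = utf8_data[2]   # first byte of that character's UTF-8 encoding

# Decoding recovers the original character data.
round_trip = utf8_data.decode("utf-8")

print(char_count, byte_count, repr(third_char), hex(third_byte))
print(round_trip == text)
```

Either form can carry the same content, but anything that measures or slices the data (len, indexing, memory use, timing) sees bytes in one case and characters in the other, whichever Python version is involved.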