wxjmfa...@gmail.com wrote:
> The very interesting aspect in the way you are holding
> unicodes (strings). By comparing Python 2 with Python 3.3,
> you are comparing utf-8 with the internal "representation"
> of Python 3.3 (the flexible string representation).

This is incorrect. Python 2 has never used UTF-8 internally for Unicode strings. In narrow builds, it uses UTF-16, but makes no allowance for surrogate pairs in strings, so a character outside the Basic Multilingual Plane is exposed as two separate code units. In wide builds, it uses UTF-32. Other implementations, such as Jython or IronPython, may do something else.

-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list
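
For anyone who wants to check the narrow/wide distinction on their own interpreter, here is a rough sketch. The helper names (is_narrow_build, utf16_code_units, utf32_code_units) are made up for illustration and are not part of any library. It only counts code units by encoding the string; it does not inspect CPython's actual internal buffers.

```python
import sys


def is_narrow_build():
    # Narrow Python 2 builds report 0xFFFF here; wide builds and
    # Python 3.3+ report 0x10FFFF.
    return sys.maxunicode == 0xFFFF


def utf16_code_units(s):
    # Number of 16-bit code units needed to store s as UTF-16.
    return len(s.encode("utf-16-le")) // 2


def utf32_code_units(s):
    # Number of 32-bit code units needed to store s as UTF-32.
    return len(s.encode("utf-32-le")) // 4


# A character outside the Basic Multilingual Plane.
non_bmp = u"\U0001F600"

print("narrow build:", is_narrow_build())
print("UTF-16 code units:", utf16_code_units(non_bmp))
print("UTF-32 code units:", utf32_code_units(non_bmp))
print("len():", len(non_bmp))
```

On a narrow Python 2 build, len() of that string matches the UTF-16 count, because the surrogate pair is exposed as two separate items. On a wide build, or on Python 3.3 and later, it matches the UTF-32 count.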