On Sunday, 5 January 2014 03:54:29 UTC+1, Chris Angelico wrote:
> On Sun, Jan 5, 2014 at 1:41 PM, Steven D'Aprano
> <steve+comp.lang.pyt...@pearwood.info> wrote:
> > wxjmfa...@gmail.com wrote:
> >
> >> The very interesting aspect in the way you are holding
> >> unicodes (strings). By comparing Python 2 with Python 3.3,
> >> you are comparing utf-8 with the internal "representation"
> >> of Python 3.3 (the flexible string representation).
> >
> > This is incorrect. Python 2 has never used UTF-8 internally for Unicode
> > strings. In narrow builds, it uses UTF-16, but makes no allowance for
> > surrogate pairs in strings. In wide builds, it uses UTF-32.
>
> That's for Python's unicode type. What Robin said was that they were
> using either a byte string ("str") with UTF-8 data, or a Unicode
> string ("unicode") with character data. So jmf was right, except that
> it's not specifically to do with Py2 vs Py3.3.

Yes, the key point is the preparation of the "unicode text" for the
PDF producer. It is at this level that the different flavours of Python
may be relevant.

I see four possibilities; I do not know what the PDF producer API
is expecting.

- Py2 with a utf-8 byte string (possibly utf-16, utf-32)
- Py2 with its internal unicode
- Py3.2 with its internal unicode
- Py3.3 with its internal unicode

jmf
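
To make the distinction between "encoded byte string" and "unicode string" concrete, here is a minimal Python 3 sketch. The helper names `to_text` and `to_utf8_bytes` are hypothetical and are not part of any PDF producer's API; which of the two forms a given producer expects is exactly the open question above.

```python
# Sketch only: `to_text` and `to_utf8_bytes` are hypothetical helper names,
# not part of any PDF library's API.

def to_text(value, encoding="utf-8"):
    """Return a text (unicode) string, decoding byte strings if needed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_utf8_bytes(value):
    """Return UTF-8 bytes, encoding text strings if needed.

    Byte strings are passed through unchanged, on the assumption
    that they already hold UTF-8 data.
    """
    if isinstance(value, bytes):
        return value
    return value.encode("utf-8")


sample = "caf\u00e9 \u20ac"  # 6 code points, two of them non-ASCII

utf8 = to_utf8_bytes(sample)         # variable width: 1 to 4 bytes per code point
utf16 = sample.encode("utf-16-le")   # 2 bytes per code point in the BMP
utf32 = sample.encode("utf-32-le")   # always 4 bytes per code point

# The same text has a different size in each encoding.
print(len(sample), len(utf8), len(utf16), len(utf32))

# Decoding the bytes gives back the original text.
print(to_text(utf8) == sample)
```

Whatever the interpreter stores internally (UTF-16 or UTF-32 in Py2 and Py3.2, the flexible representation in Py3.3), the encoded bytes produced this way are the same; only the in-memory form of the unicode string differs.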