On Friday, 17 August 2012 20:21:34 UTC+2, Jerry Hill wrote:
> On Fri, Aug 17, 2012 at 1:49 PM, <wxjmfa...@gmail.com> wrote:
> > The character '…', Unicode name 'HORIZONTAL ELLIPSIS',
> > is one of these characters existing in the cp1252, mac-roman
> > coding schemes and not in iso-8859-1 (latin-1) and obviously
> > not in ascii. It causes Py3.3 to work a few 100% slower
> > than Py<3.3 versions due to the flexible string representation
> > (ascii/latin-1/ucs-2/ucs-4) (I found cases up to 1000%).
> >
> > >>> '…'.encode('cp1252')
> > b'\x85'
> > >>> '…'.encode('mac-roman')
> > b'\xc9'
> > >>> '…'.encode('iso-8859-1') # latin-1
> > Traceback (most recent call last):
> >   File "<eta last command>", line 1, in <module>
> > UnicodeEncodeError: 'latin-1' codec can't encode character '\u2026' in position 0: ordinal not in range(256)
> >
> > If one could neglect this (typographically important) glyph, what
> > to say about the characters of the European scripts (languages)
> > present in cp1252 or in mac-roman but not in latin-1 (eg. the
> > French script/language)?
>
> So... python should change the longstanding definition of the latin-1
> character set? This isn't some sort of python limitation, it's just
> the reality of legacy encodings that actually exist in the real world.
>
> > Very nice. Python 2 was built for ascii user, now Python 3 is
> > *optimized* for, let say, ascii user!
> >
> > The future is bright for Python. French users are better
> > served with Apple or MS products, simply because these
> > corporates know you can not write French with iso-8859-1.
> >
> > PS When "TeX" moved from the ascii encoding to iso-8859-1
> > and the so called Cork encoding, "they" know this and provided
> > all the complementary packages to circumvent this. It was
> > in 199? (Python was not even born).
> >
> > Ditto for the foundries (Adobe, Linotype, ...)
>
> I don't understand what any of this has to do with Python. Just
> output your text in UTF-8 like any civilized person in the 21st
> century, and none of that is a problem at all. Python makes that easy.
> It also makes it easy to interoperate with older encodings if you
> have to.

Sorry, you missed the point. My comment had nothing to do with the
encoding of the source code, the encoding of a Python "string" literal
in the source code, or the display of a Python 3 <str>. I wrote about
the *internal* Python "coding", the way Python keeps "strings" in
memory. See PEP 393.

jmf
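
For readers who want to see what "the way Python keeps strings in memory" means in practice, here is a minimal sketch, assuming CPython 3.3 or later (the flexible representation from PEP 393 is a CPython implementation detail, so other interpreters may behave differently). The helper `bytes_per_char` and the `samples` table are hypothetical names made up for this illustration. The sketch only looks at memory use; it does not measure speed and says nothing about the timings claimed in the thread.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate per-character storage for a string made only of `ch`.

    Hypothetical helper: subtracting the size of a 1-character string
    from the size of an (n + 1)-character string cancels the fixed
    object header, leaving n times the per-character width.
    """
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) // n

# One representative character per storage class named in the thread.
samples = {
    "ascii": "a",
    "latin-1": "\u00e9",    # e with acute accent, inside latin-1
    "ucs-2": "\u2026",      # HORIZONTAL ELLIPSIS, outside latin-1
    "ucs-4": "\U0001F600",  # a character outside the Basic Multilingual Plane
}

widths = {name: bytes_per_char(ch) for name, ch in samples.items()}
for name, width in widths.items():
    print(f"{name:8} {width} byte(s) per character")

# A single wide character changes the storage width of the whole string.
base = "a" * 1000
narrow = sys.getsizeof(base + "b")       # stays in the 1-byte form
wide = sys.getsizeof(base + "\u2026")    # every character now takes 2 bytes
print("ascii only:", narrow, "bytes; with one ellipsis:", wide, "bytes")
```

On CPython 3.3+ the four widths should come out as 1, 1, 2 and 4 bytes. The last two lines show the effect under discussion: appending one '…' to an otherwise ASCII string forces the whole string into the 2-byte form, so the object grows to roughly twice the size.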