[Thomas Heller]

> François Pinard <[EMAIL PROTECTED]> writes:

> > [...] given file `question.py' with this contents:

> >   # -*- coding: UTF-8 -*-
> >   texte = unicode("Fran\xe7ois", 'latin1')
> >   print type(texte), repr(texte), texte
> >   print type(texte), repr(texte), str(texte)

> > doing `python question.py' yields:

> >   <type 'unicode'> u'Fran\xe7ois' François
> >   <type 'unicode'> u'Fran\xe7ois'
> >   Traceback (most recent call last):
> >     File "question.py", line 4, in ?
> >       print type(texte), repr(texte), str(texte)
> >   UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' \
> >       in position 4: ordinal not in range(128)

> > [...] why is the first `print' working over its third argument, but
> > not the second?  How does `print' convert that Unicode string to a
> > 8-bit string for output, if not through `str()'?  What is missing to
> > the documentation, or to my way of understanding it?

> AFAIK, print uses sys.stdout.encoding to encode the unicode string.

Much thanks for this information.  I was not aware of this file
attribute.  Looking around, I found a quick description in the Library
Reference, under "2.3.8 File Objects".  However, I did not find in the
documentation the rules stating how or when this attribute receives a
value, and in particular here, for the case of `sys.stdout'.

The Reference Manual, under "6.6 The print statement", is silent about
how Unicode strings are handled.  Am I looking in the wrong places, or
else, should not the standard documentation more handily explain such
things?

-- 
François Pinard   http://pinard.progiciels-bpi.ca
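For what it is worth, the difference between the two `print' lines can be sketched as below. This is only an illustration under stated assumptions, not a description of the interpreter's internals: the helper `encode_for' is a hypothetical name, and the fallback to ASCII when no encoding is given stands in for the default encoding that `str()' uses on a Unicode string. The sketch avoids the `print' statement so that it also runs on newer interpreters.

```python
import sys

texte = u"Fran\xe7ois"


def encode_for(text, encoding):
    """Hypothetical helper: encode text as a stream with this encoding would.

    Assumption: when no encoding is known (None), fall back to ASCII,
    which mirrors the default encoding used by str() in the example above.
    """
    return text.encode(encoding or "ascii")


# What str(texte) attempts: an implicit ASCII encoding, which fails on u'\xe7'.
try:
    encode_for(texte, None)
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# What a stream with a known encoding can do instead.
latin1_bytes = encode_for(texte, "latin-1")
utf8_bytes = encode_for(texte, "utf-8")

# The attribute Thomas mentions; it may be None or absent on some streams.
stdout_encoding = getattr(sys.stdout, "encoding", None)
```

With a terminal whose encoding is known, the first `print' succeeds because the stream's encoding can represent the character, while `str(texte)' goes through ASCII and raises before `print' ever sees the result.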