On Jul 23, 4:06 am, Naoki INADA <songofaca...@gmail.com> wrote:
> In document <http://docs.python.org/library/stdtypes.html#file.encoding>:
>
> >> The encoding that this file uses. When Unicode strings are written to a
> >> file, they will be converted to byte strings using this encoding. In
> >> addition, when the file is connected to a terminal, the attribute gives
> >> the encoding that the terminal is likely to use
>
> But in logging.StreamHandler.emit() ::
>
>     try:
>         if (isinstance(msg, unicode) and
>             getattr(stream, 'encoding', None)):
>             #fs = fs.decode(stream.encoding)
>             try:
>                 stream.write(fs % msg)
>             except UnicodeEncodeError:
>                 #Printing to terminals sometimes fails. For example,
>                 #with an encoding of 'cp1251', the above write will
>                 #work if written to a stream opened or wrapped by
>                 #the codecs module, but fail when writing to a
>                 #terminal even when the codepage is set to cp1251.
>                 #An extra encoding step seems to be needed.
>                 stream.write((fs % msg).encode(stream.encoding))
>         else:
>             stream.write(fs % msg)
>     except UnicodeError:
>         stream.write(fs % msg.encode("UTF-8"))
>
> And behavior of sys.stdout in Windows::
>
>     >>> import sys
>     >>> sys.stdout.encoding
>     'cp932'
>     >>> u = u"あいう"
>     >>> u
>     u'\u3042\u3044\u3046'
>     >>> print >>sys.stdout, u
>     あいう
>     >>> sys.stderr.write(u)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     UnicodeEncodeError: 'ascii' codec can't encode characters in position
>     0-2: ordinal not in range(128)
>
> What is file.encoding convention?
> If I want to write a unicode string to a file(-like) that have
> encoding attribute, I should do
> (1) try: file.write(unicode_str),
> (2) except UnicodeEncodeError: file.write(unicode_str.encode(file.encoding))
> like logging?
> It seems ugly.

If you are writing a Unicode string to a stream which has been opened
with e.g. codecs.open with a specific encoding, then the stream is
actually a wrapper. You can write Unicode strings directly to it, and
the wrapper stream will encode the Unicode to bytes using that
encoding and write those bytes to the underlying stream.

In your example you didn't show sys.stderr.encoding. You showed
sys.stdout.encoding and printed something to sys.stdout, which seemed
to give the correct result, but then wrote to sys.stderr, which gave a
UnicodeEncodeError. What is the encoding of sys.stderr in your example?

Also note that logging had to handle what appeared to be an oddity
with terminals: they (at least sometimes) have an encoding attribute
but appear to expect to have bytes written to them, and not Unicode.
Hence the logging kludge, which should not be needed and so has been
carefully commented.

Regards,

Vinay Sajip
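
P.S. To illustrate the wrapper behaviour described above, here is a minimal sketch. It assumes an in-memory io.BytesIO object standing in for the underlying byte stream (a file or terminal), and it uses codecs.getwriter rather than codecs.open so that no file on disk is needed. The names raw, writer and text are only for illustration.

```python
import codecs
import io

# In-memory byte stream standing in for the underlying file or terminal
# (an assumption for this sketch, so that nothing is written to disk).
raw = io.BytesIO()

# codecs.getwriter returns the StreamWriter class for the named encoding;
# calling it wraps the byte stream, much as codecs.open wraps a file.
writer = codecs.getwriter("cp932")(raw)

# The three hiragana characters from the original example.
text = u"\u3042\u3044\u3046"

# Unicode goes in; the wrapper encodes it and passes bytes to raw.
writer.write(text)

encoded = raw.getvalue()
```

After the write, encoded holds the same bytes as text.encode("cp932"), so the caller never has to encode by hand. The fallback in logging is only there for streams that advertise an encoding attribute without actually doing this conversion themselves.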