Bugs item #1314107, was opened at 2005-10-05 11:11
Message generated for change (Settings changed) made by tungwaiyip
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1314107&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Unicode
Group: Python 2.4
>Status: Open
Resolution: Fixed
Priority: 5
Submitted By: Wai Yip Tung (tungwaiyip)
Assigned to: Vinay Sajip (vsajip)
Summary: Issue in unicode args in logging

Initial Comment:
logging has an issue in handling unicode object arguments.

>>> import logging
>>>
>>> class Obj:
...     def __init__(self,name):
...         self.name = name
...     def __str__(self):
...         return self.name
...
>>> # a non-ascii string
...
>>> obj = Obj(u'\u00f6')
>>>
>>> # this will cause error
...
>>> print '%s' % obj
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 0: ordinal not in range(128)
>>>
>>> # this will promote to unicode (and the console also happens to be able to display it)
...
>>> print u'%s' % obj
ö
>>>
>>> # this works fine
... # (other than logging makes its own decision to encode in utf8)
...
>>> logging.error(u'%s' % obj)
ERROR:root:├╢
>>>
>>> # THIS IS AN UNEXPECTED PROBLEM!!!
...
>>> logging.error(u'%s', obj)
Traceback (most recent call last):
  File "C:\Python24\lib\logging\__init__.py", line 706, in emit
    msg = self.format(record)
  File "C:\Python24\lib\logging\__init__.py", line 592, in format
    return fmt.format(record)
  File "C:\Python24\lib\logging\__init__.py", line 382, in format
    record.message = record.getMessage()
  File "C:\Python24\lib\logging\__init__.py", line 253, in getMessage
    msg = msg % self.args
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 0: ordinal not in range(128)
>>>
>>> # workaround the str() conversion in getMessage()
...
>>> logging.error(u'%s-\u00f6', obj)
ERROR:root:├╢-├╢

The issue seems to be in LogRecord.getMessage(). It attempts to convert msg to a byte string:

    msg = str(self.msg)

I am not sure why it wants to do the conversion.
The last example works around this by making sure msg is not convertible to a byte string.

----------------------------------------------------------------------

>Comment By: Wai Yip Tung (tungwaiyip)
Date: 2005-10-06 16:16

Message:
Logged In: YES
user_id=561546

>>To ensure good Unicode support, ensure your messages are either Unicode strings or objects whose __str__() method returns a Unicode string. Then,
>>msg = msg % args

That's what I am doing already.

Let me explain the subtle problem again.

1. print '%s' % obj - error
2. logging.error(u'%s' % obj) - ok
3. logging.error(u'%s', obj) - error
4. logging.error(u'%s-\u00f6', obj) - ok

I can understand how 1 fails. But I expect 2, 3 and 4 to work similarly. Especially contrast 3 with 4. 4 works when 3 doesn't because when str() is applied to u'%s-\u00f6' it fails and falls back to the original unicode string, which is the correct way in my opinion. Whereas in 3, the u'%s' gets demoted to the byte string '%s', so it fails like 1.

----------------------------------------------------------------------

Comment By: Vinay Sajip (vsajip)
Date: 2005-10-06 01:44

Message:
Logged In: YES
user_id=308438

Misc. changes were backported into Python 2.4.2, please check that you have this version.

The problem is not with

    msg = str(self.msg)

but rather with

    msg = msg % args

To ensure good Unicode support, ensure your messages are either Unicode strings or objects whose __str__() method returns a Unicode string. Then,

    msg = msg % args

should result in a Unicode object. You can pass this to a FileHandler opened with an encoding argument, or a StreamHandler whose stream has been opened using codecs.open(). Ensure your default encoding is set correctly using sitecustomize.py.

The encoding additions were made in Revision 1.26 of logging/__init__.py, dated 13/03/2005.

Marking as closed.
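
The four numbered cases in the follow-up above can be modelled in a few lines. This is only a sketch in Python 3 that imitates the Python 2 behaviour described in the thread, not the logging module's code: `py2_str`, `model_get_message` and `outcome` are hypothetical helpers, and the ASCII codec stands in for the Python 2 default encoding.

```python
class Obj:
    """Same shape as the Obj class in the report: __str__ returns non-ASCII text."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


def py2_str(value):
    """Model of Python 2 str(): text becomes a byte string via the ASCII codec."""
    return str(value).encode("ascii")  # raises UnicodeEncodeError on non-ASCII text


def model_get_message(msg, args):
    """Hypothetical model of the str(self.msg) step followed by msg % args."""
    try:
        template = py2_str(msg)
    except UnicodeEncodeError:
        # Non-ASCII template: keep the Unicode string, so formatting stays Unicode.
        return msg % args
    # An ASCII-only template was demoted to bytes, so every argument is also
    # forced through the ASCII codec. This is where case 3 fails.
    return template % tuple(py2_str(a) for a in args)


def outcome(msg, args):
    """Return 'ok' or 'error' for one modelled logging call."""
    try:
        model_get_message(msg, args)
        return "ok"
    except UnicodeEncodeError:
        return "error"


obj = Obj("\u00f6")
print(outcome("\u00f6", ()))            # case 2: message formatted before the call
print(outcome("%s", (obj,)))            # case 3: ASCII-only template
print(outcome("%s-\u00f6", (obj,)))     # case 4: non-ASCII template
```

In this model the only difference between cases 3 and 4 is whether the template itself survives the ASCII conversion, which matches the contrast drawn in the comment.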

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2005-10-05 21:00

Message:
Logged In: YES
user_id=33168

Vinay, any suggestions?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2005-10-05 13:47

Message:
Logged In: YES
user_id=38388

Unassigning the bug. I don't know anything about the logging module.

Hint: perhaps the logging module should grow an .encoding attribute which then allows converting Unicode to some encoding used in the log file?!

----------------------------------------------------------------------
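
The encoding idea hinted at above corresponds to the FileHandler encoding argument mentioned elsewhere in the thread. A minimal sketch follows, written in Python 3 syntax rather than the Python 2.4 under discussion; the helper `log_unicode_to_file` and the logger name `encoding_demo` are made up for illustration.

```python
import logging
import os
import tempfile


def log_unicode_to_file(text):
    """Log `text` through a FileHandler with an explicit encoding; return the bytes written."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)

    logger = logging.getLogger("encoding_demo")  # hypothetical logger name
    logger.setLevel(logging.ERROR)
    logger.propagate = False  # keep the demo output out of the root logger

    # The handler, not the caller, decides how Unicode text is encoded on disk.
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger.addHandler(handler)
    try:
        logger.error("%s", text)
    finally:
        logger.removeHandler(handler)
        handler.close()

    try:
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


data = log_unicode_to_file("\u00f6")
print(data.decode("utf-8").strip())
```

Leaving the choice of encoding to the handler keeps the log message itself a Unicode string until the moment it is written.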