New submission from Mark Dickinson <dicki...@gmail.com>: We're seeing UnicodeDecodeErrors on Windows / Python 2.7 when using logging on a Japanese customer machine. The cause turns out to be in the log record formatting, where unicode fields are combined with non-decodable bytestrings coming from strftime.
More details: we were using the following formatter: _LOG_FORMATTER = logging.Formatter( "%(asctime)s %(levelname)s:%(name)s:%(message)s", datefmt="%Y-%m-%dT%H:%M:%S%Z", ) In the logging internals, that `datefmt` gets passed to `time.strftime`, which on the machine in question produced a non-ASCII bytestring (i.e., type `str`). When combined with other Unicode strings in the log record, this gave the `UnicodeDecodeError`. I'm unfortunately failing to reproduce this directly on my own macOS / UK locale machine, but it's documented that `time.strftime` returns a value encoded according to the current locale. In this particular case, the output we were getting from the `time.strftime` call looked like: "2018-12-06T23:57:14\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)" which assuming an encoding of cp932 decodes to something plausible: >>> s.decode("cp932") u'2018-12-06T23:57:14\u6771\u4eac (\u6a19\u6e96\u6642)' >>> print(s.decode("cp932")) 2018-12-06T23:57:14東京 (標準時) It looks as though the logging module should be explicitly decoding the strftime output before doing formatting, using for example what's recommended in the strftime documentation [1]: strftime(<myformat>).decode(locale.getlocale()[1]) Code links: this is the line that's producing non-decodable bytes: https://github.com/python/cpython/blob/49cedc51a68b4cd2525c14ab02bd1a483d8be389/Lib/logging/__init__.py#L425 ... and this is the formatting operation that then ends up raising UnicodeDecodeError as a result of those: https://github.com/python/cpython/blob/49cedc51a68b4cd2525c14ab02bd1a483d8be389/Lib/logging/__init__.py#L469 This isn't an issue on Python 3, and I was unable to reproduce it on my non-Windows machine; that particular form of strftime output may well be specific to Windows (or possibly even specific to Japanese flavours of Windows). [1] https://docs.python.org/2/library/time.html#time.strftime ---------- components: Library (Lib) messages: 331238 nosy: mark.dickinson priority: normal severity: normal status: open title: logging UnicodeDecodeError from undecodable strftime output type: behavior versions: Python 2.7 _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue35427> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com