Francis Moreau wrote:

> Hello,
>
> My application is using the gettext module to do the translation
> stuff. Translated messages are unicode on both python 2 and
> 3 (with python2.7 I had to explicitly ask for unicode).
>
> A problem arises when formatting those messages before logging
> them. For example:
>
>   log.debug("%s: %s" % (header, _("will return an unicode string")))

This is only problematic if header is a non-ascii bytestring.

> Indeed on python2.7, "%s: %s" is 'str' whereas _() returns
> unicode.
>
> My question is: how should this be fixed properly ?
>
> A simple solution would be to force all strings passed to the
> logger to be unicode:
>
>   log.debug(u"%s: %s" % ...)
>
> and more generally force all string in my code to be unicode by
> using the 'u' prefix.
>
> or is there a proper solution ?

You don't need to change an all-ascii bytestring to unicode. Lo and behold:

>>> "%s %s" % (u"üblich", u"ähnlich")
u'\xfcblich \xe4hnlich'
>>> u"%s %s" % (u"üblich", u"ähnlich")
u'\xfcblich \xe4hnlich'

Only non-ascii bytestrings mean trouble, either noisy

>>> u"%s nötig %s" % (u"üblich", "ähnlich")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
>>> "%s nötig %s" % (u"üblich", u"ähnlich")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)

or silent, until you have to decipher the logfile contents. It's best to stay
away from non-ascii bytestrings altogether, and the
"from __future__ import unicode_literals" that Chris mentioned is a convenient
way to achieve that.
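
If header can arrive as a non-ascii bytestring (read from a file or a socket, say), one way to stay out of trouble is to decode it once at the boundary and only ever format unicode. The following is a rough sketch, not a recommendation from the thread: to_text() is a hypothetical helper, _() is a stand-in for your gettext function, and UTF-8 is an assumed encoding. It should run unchanged on python 2.7 and 3.

```python
import io
import logging


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is a bytestring."""
    # Hypothetical helper; the utf-8 default is an assumption.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def _(message):
    # Stand-in for the gettext function from the question; like the
    # real one it is assumed to return unicode text.
    return message


# Log into an in-memory text stream so the result can be inspected.
stream = io.StringIO()
log = logging.getLogger("unicode_demo")
log.setLevel(logging.DEBUG)
log.propagate = False
log.addHandler(logging.StreamHandler(stream))

# A non-ascii bytestring, as it might come from outside the program.
header = u"\xfcblich".encode("utf-8")

# Decode once, then let the logger format a unicode template.
log.debug(u"%s: %s", to_text(header), _(u"\xe4hnlich"))
logged = stream.getvalue()
```

Passing the arguments to log.debug() separately, instead of applying % yourself, also defers the formatting until the record is actually emitted.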