On Fri, Nov 21, 2014 at 12:59 AM, <random...@fastmail.us> wrote:
> On Thu, Nov 20, 2014, at 07:35, Peter Otten wrote:
>> >>> "%s nötig %s" % (u"üblich", u"ähnlich")
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4:
>> ordinal not in range(128)
>
> This is surprising to me - why is it trying to decode the format string,
> rather than encode the arguments?

Why should it encode to bytes? It makes much better sense to work in
Unicode. But mainly, it has to do one of them, and be predictable. If
you add a float and an int, you have to predictably get back one of
those two types, and since neither is a perfect superset of the other,
one just has to be picked. (And that's float, because it's more likely
to be the better option.)

In this case, picking Unicode to meet on is easily the better option,
because you'll often have pure-ASCII string literals as format strings,
and Unicode data being interpolated into them. So using an ASCII codec
is far more likely to succeed if you decode the format string than if
you encode the data.

Personally, I'd much rather be very clear about what's text and what's
bytes, and not have any automatic encoding at all. That's why I use
Python 3.

ChrisA
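
For anyone who wants to poke at the ASCII-codec argument, here is a rough sketch. It is written for Python 3, where nothing is converted implicitly, so the two choices Python 2 could have made are spelled out by hand. The variable names and the English format string are made up for illustration; only the "%s nötig %s" example comes from the thread.

```python
# Python 3 sketch: the implicit ASCII conversion that Python 2 performs
# is done explicitly here, once in each direction.

fmt = b"%s is needed, %s"        # hypothetical pure-ASCII format string (bytes)
args = ("üblich", "ähnlich")     # Unicode data to interpolate

# Choice 1: decode the format string and meet on Unicode.
# This works whenever the format string is pure ASCII, which is the usual case.
result = fmt.decode("ascii") % args

# Choice 2: encode the arguments and meet on bytes.
# This fails as soon as the data contains anything outside ASCII.
try:
    encoded_args = tuple(a.encode("ascii") for a in args)
except UnicodeEncodeError as exc:
    encoded_args = None
    print("encoding the arguments failed:", type(exc).__name__)

# The failure quoted above is choice 1 applied to a format string that is
# not pure ASCII: its UTF-8 bytes cannot be decoded with the ASCII codec.
try:
    "%s nötig %s".encode("utf-8").decode("ascii")
except UnicodeDecodeError as exc:
    decode_error = exc
    print("decoding the format string failed at byte", exc.start)
```

Choice 1 succeeds and choice 2 raises, which is the asymmetry described above. The last block reproduces the original traceback: the UTF-8 encoding of "ö" starts with byte 0xc3 at position 4.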