Ezio Melotti <[EMAIL PROTECTED]> added the comment:

Usually, when you do operations involving unicode and normal strings, the latter are coerced to unicode using the default encoding. If the codec is not able to decode the string a UnicodeDecodeError is raised. E.g.:

>>> 'à' + u'foo'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x85 in position 0: ordinal not in range(128)

The same error is raised with u'%s' % 'à'.
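
For readers without a Python 2 interpreter at hand, the coercion rule described above can be modelled roughly as follows. This is only a sketch written for Python 3, not the interpreter's actual implementation: DEFAULT_ENCODING, coerce, concat and fmt are hypothetical names introduced for illustration, and 'ascii' is assumed as the default encoding because that is what the traceback reports.

```python
# Sketch only: models Python 2's implicit byte-string -> unicode coercion.
# All names here (DEFAULT_ENCODING, coerce, concat, fmt) are hypothetical.

DEFAULT_ENCODING = "ascii"  # assumed default, as reported in the traceback


def coerce(value):
    """Decode a byte string with the default encoding; pass text through."""
    if isinstance(value, bytes):
        # Raises UnicodeDecodeError if the bytes are not valid in the encoding.
        return value.decode(DEFAULT_ENCODING)
    return value


def concat(left, right):
    """Model of 'byte string' + u'text' under implicit coercion."""
    return coerce(left) + coerce(right)


def fmt(template, arg):
    """Model of u'%s' % 'byte string' under implicit coercion."""
    return coerce(template) % (coerce(arg),)


# Pure-ASCII byte strings decode cleanly and mix with text.
print(concat(b"bar", "foo"))
print(fmt("%s", b"abc"))

# A non-ASCII byte (0x85, the byte from the traceback) cannot be decoded.
try:
    concat(b"\x85", "foo")
except UnicodeDecodeError as exc:
    print("UnicodeDecodeError:", exc)
```

The point of the sketch is that both operators go through the same decoding step, so they fail in the same way on the same input.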

I think that 'à' in u'foo' should behave in the same way (i.e. try to decode the string and possibly raise a UnicodeDecodeError). This is probably the most coherent and backward-compatible solution, at least in Python 2.x. In Python 2.x normal and unicode strings are often mixed, and having 'f' in u'foo' raise a TypeError will probably break a lot of code. In Python 3.x it could make sense: strings are unicode by default and you are not supposed to mix byte strings and unicode strings, so we may require an explicit decoding.

The behavior should be consistent across all operations: if we decide to raise a TypeError with 'in', it should be raised with '+' and '%' (and possibly others) as well.

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue4328>
_______________________________________