On Sun, 04 Nov 2012 01:14:29 +0000, Oscar Benjamin wrote:

> On 3 November 2012 22:50, Chris Angelico <ros...@gmail.com> wrote:
>> This one I haven't checked the source for, but ISTR discussions on this
>> list about comparison of two unequal interned strings not being
>> optimized, so they'll end up being compared char-for-char. Using 'is'
>> guarantees that the check stops with identity. This may or may not be
>> significant, and as you say, defending against an uninterned string
>> slipping through is potentially critical.
>
> The source is here (and it shows what you suggest):
> http://hg.python.org/cpython/file/6c639a1ff53d/Objects/unicodeobject.c#l6128

I don't think it does, although I could be wrong; I find reading C quite
difficult. The unicode_compare function compares character by character,
true, but it doesn't get called directly. The public interface is
PyUnicode_Compare, which includes this test before calling
unicode_compare:

    /* Shortcut for empty or interned objects */
    if (v == u) {
        Py_DECREF(u);
        Py_DECREF(v);
        return 0;
    }
    result = unicode_compare(u, v);

where v and u are pointers to the unicode objects. So it appears that the
test for the strings being of equal length has been dropped, but the
identity test is still present.

> Comparing strings char for char is really not that big a deal though.

Depends on how big the strings are and where the first difference is.

> This has been discussed before: you don't need to compare very many
> characters to conclude that strings are unequal (if I remember correctly
> you were part of that discussion).

On average. Worst case, you have to look at every character.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list
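
P.S. For anyone who, like me, would rather not read the C: below is a rough pure-Python model of the shape described above, an identity shortcut followed by a character-by-character loop. It is only a sketch, not the real implementation. The names compare_model and make are invented for illustration, and the count of characters examined is something I added so the cost is visible; the C code keeps no such counter.

```python
def compare_model(u, v):
    """Toy model of an identity shortcut followed by a char-by-char compare.

    Returns (result, chars_examined), where result is -1, 0 or 1.
    The counter is illustrative only; it is not part of CPython.
    """
    # Shortcut: the very same object compares equal with no further work.
    if u is v:
        return 0, 0

    examined = 0
    for a, b in zip(u, v):
        examined += 1
        if a != b:
            # Stop at the first difference.
            return (-1 if a < b else 1), examined

    # One string is a prefix of the other (or they are equal):
    # the shorter one sorts first.
    return (len(u) > len(v)) - (len(u) < len(v)), examined


def make(n, last):
    """Build a fresh n-character string of 'a's ending in `last`."""
    return "".join(["a"] * (n - 1) + [last])


s = make(1000, "a")
print(compare_model(s, s))                              # (0, 0)
print(compare_model(make(1000, "a"), make(1000, "a")))  # (0, 1000)
print(compare_model(make(1000, "x"), make(1000, "y")))  # (-1, 1000)
print(compare_model("xaaa", "yaaa"))                    # (-1, 1)
```

The first call hits the identity shortcut. The second and third are the worst case: equal-but-distinct strings, or strings differing only in the last character, mean every character is examined. The last call is the cheap case, where the first character already differs.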