On 2012-09-10, Oscar Benjamin <oscar.j.benja...@gmail.com> wrote:
> On 2012-09-10, Chris Angelico <ros...@gmail.com> wrote:
>> On Tue, Sep 11, 2012 at 12:06 AM, Oscar Benjamin
>> <oscar.j.benja...@gmail.com> wrote:
>>> On 2012-09-10, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:
>>>> What interning buys you is that "s == t" is an O(1) pointer compare if
>>>> they are equal. But if s and t differ in the last character, __eq__ will
>>>> still inspect every character. There is no way to tell Python "all
>>>> strings are interned, if s is not t then s != t as well".
>>>>
>>>
>>> I thought that if *both* strings were interned then a pointer comparison
>>> could decide if they were unequal without needing to check the characters.
>>>
>>> Have I misunderstood how intern() works?
>>
>> In a language where _all_ strings are guaranteed to be interned (such
>> as Lua, I think), you do indeed gain this. Pointer inequality implies
>> string inequality. But when interning is optional (as in Python), you
>> cannot depend on that, unless there's some way of recognizing interned
>> strings. Of course, that may indeed be the case; a simple bit flag
>> "this string has been interned" would suffice, and if both strings are
>> interned AND their pointers differ, THEN you can be sure the strings
>> differ.
>>
>> I have no idea whether or not CPython version X.Y.Z does this. The
>> value of such an optimization really depends on how likely strings are
>> to be interned; for instance, if the compiler automatically interns
>> all the names of builtins, this could be quite beneficial. Otherwise,
>> probably not; most Python scripts don't bother interning anything.
>>
>
> I haven't looked at the source but my understanding was precisely that there
> is an intern() bit and that not only the builtins module but all the literals
> in any byte-compiled module are interned.
>
s/literals/identifiers/

You can see the interned flag in the PyUnicodeObject struct here:
http://hg.python.org/cpython/file/3ffd6ad93fe4/Include/unicodeobject.h#l303

Oscar
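
For anyone who wants to try the idea from this thread, here is a minimal sketch, assuming Python 3, where the function lives at sys.intern() (in Python 2 it was the builtin intern()). The helper name eq_if_both_interned is made up for illustration. Pure Python cannot read the interned flag, so the helper simply trusts the caller that both arguments are interned. It says nothing about what CPython's own string comparison does internally.

```python
import sys


def eq_if_both_interned(s, t):
    # Hypothetical helper, not a CPython API. It is only valid when the
    # caller guarantees that BOTH strings are interned: in that case
    # identity decides equality and inequality alike, with no character scan.
    return s is t


# Build the strings at runtime so the compiler cannot merge them as constants.
a = sys.intern("".join(["hello", " ", "world"]))
b = sys.intern("".join(["hello", " ", "world"]))
c = sys.intern("".join(["hello", " ", "worle"]))  # differs in the last character

print(a is b)                     # same interned object
print(eq_if_both_interned(a, c))  # different objects, so the strings differ
print(a == c)                     # the ordinary comparison agrees
```

If either argument is not interned, the helper can wrongly report two equal strings as different, which is the caveat Steven and Chris raise above.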