On Sun, 25 Nov 2007 01:38:51 +0100, Hrvoje Niksic wrote:

> samwyse <[EMAIL PROTECTED]> writes:
>
>> create a hash that maps your keys to themselves, then use the values of
>> that hash as your keys.
>
> The "atom" function you describe already exists under the name "intern".

Not really. intern() works very differently, because it can tie itself to
the Python internals. Samwyse's atom() function doesn't, and so has no
purpose.

In any case, I'm not sure that intern() actually will solve the OP's
problem, even assuming it is a real and not imaginary problem. According
to the docs, intern()'s purpose is to speed up dictionary lookups, not to
save memory. I suspect that if it does save memory, it will be by
accident.

From the docs:
http://docs.python.org/lib/non-essential-built-in-funcs.html

    intern(string)
        Enter string in the table of ``interned'' strings and return the
        interned string - which is string itself or a copy. Interning
        strings is useful to gain a little performance on dictionary
        lookup - if the keys in a dictionary are interned, and the lookup
        key is interned, the key comparisons (after hashing) can be done
        by a pointer compare instead of a string compare. Normally, the
        names used in Python programs are automatically interned, and the
        dictionaries used to hold module, class or instance attributes
        have interned keys.

        Changed in version 2.3: Interned strings are not immortal (like
        they used to be in Python 2.2 and before); you must keep a
        reference to the return value of intern() around to benefit from
        it.

Note the words "which is string itself or a copy". It would be ironic if
the OP uses intern to avoid having copies of strings, and ends up with
even more copies than if he didn't bother.

I guess he'll actually need to measure his memory consumption and see
whether he actually has a memory problem or not, right?

--
Steven.