Nick Vatamaniuc wrote: > The original post just said that he wanted to index some values by > their urls and didn't really care about the urls themselves, so I > suggested that he just use the hash of the key as the key. The > dictionary will then take a hash of that [note that: > hash(string)=hash(hash(string)) ] then the dictionary will not keep > the reference to the urls and GC will collect them. So instead of > dic[urlstring']=value he will do dic[hash(urlstring)]=value. But then > to retrieve he will have to do old_value=dic[hash(urlstring]].
Note that it is trivial to catch collisions on entry and correct them: key = hash(url_string) i = 0 if d.has_key(key): while d.has_key(key): i += 1 d[key + repr(i)] = val else: d[key] = val or something along those lines. No need to keep the original string. Obviously this is only efficient if collisions are rare. I am a little surprised that hash(hash(s)) == hash(s), is that actually true? Cheers, Terry -- Terry Hancock ([EMAIL PROTECTED]) Anansi Spaceworks http://www.AnansiSpaceworks.com -- http://mail.python.org/mailman/listinfo/python-list