On Sat, Jan 14, 2012 at 3:42 PM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:
> On the Python Dev mailing list, there is a discussion going on about the
> stability of the hash function for strings.
>
> How many people rely on hash(some_string) being stable across Python
> versions? Does anyone have code that will be broken if the string hashing
> algorithm changes?

On reading your post I immediately thought that you could, if changing the
algorithm, simultaneously fix the issue of malicious collisions, but that
appears to be what you're doing it for primarily :)

Suggestion: Create a subclass of dict, the SecureDict or something, which
could either perturb the hashes or even use a proper cryptographic hash
function; normal dictionaries can continue to use the current algorithm.
The description in Objects/dictnotes.txt suggests that it's still well
worth keeping the current system for programmer-controlled dictionaries,
and changing only user-controlled ones (such as POST data).

It would then be up to the individual framework and module authors to make
use of this, but it would not impose any cost on the myriad other uses of
dictionaries. There's no point adding extra load to every name lookup just
because of a security issue in an extremely narrow situation. It would
also mean that code relying on hash(str) stability wouldn't be broken.

ChrisA