Marc-Andre Lemburg <m...@egenix.com> added the comment: Charles-François Natali wrote: > > Anyway, I still think that the hash randomization is the right way to > go, simply because it does solve the problem, whereas the collision > counting doesn't: Martin made a very good point on python-dev with his > database example.
For completeness, I quote Martin here: """ The main issue with that approach is that it allows a new kind of attack. An attacker now needs to find 1000 colliding keys, and submit them one-by-one into a database. The limit will not trigger, as those are just database insertions. Now, if the applications also as a need to read the entire database table into a dictionary, that will suddenly break, and not for the attacker (which would be ok), but for the regular user of the application or the site administrator. So it may be that this approach actually simplifies the attack, making the cure worse than the disease. """ Martin is correct in that it is possible to trick an application into building some data pool which can then be used as indirect input for an attack. What I don't see is what's wrong with the application raising an exception in case it finds such data in an untrusted source (reading arbitrary amounts of user data from a database is just as dangerous as reading such data from any other source). The exception will tell the programmer to be more careful and patch the application not to read untrusted data without additional precautions. It will also tell the maintainer of the application that there was indeed an attack on the system which may need to be tracked down. Note that the collision counting demo patch is trivial - I just wanted to demonstrate how it works. As already mentioned, there's room for improvement: If Python objects were to provide an additional method for calculating a universal hash value (based on an integer input parameter), the dictionary in question could use this to rehash itself and avoid the attack. Think of this as "randomization when needed". (*) Since the dict would still detect the problem, it could also raise a warning to inform the maintainer of the application. So you get the best of both worlds and randomization would only kick in when it's really needed to keep the application running. ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13703> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com