Arnaud Delobelle wrote: > Bryan Olson wrote: >> We mean that the party supplying the keys deliberately chose >> them to make the hash table inefficient. In this thread the goal >> is efficiency; a party working toward an opposing goal is an >> adversary. > > There are situations where this can happen I guess
Cormen-Leiserson-Rivest /Introduction to Algorithms/ discusses it in section 12.3.3 (first edition). They suggest a family of hash functions, where an individual hash is chosen at random when creating the table. That still doesn't beat adversaries who can get on-line timing information and figure out when they induce bad cases. More generally, security researchers have started to look at "algorithmic complexity denial of service attacks". http://www.cs.rice.edu/~scrosby/hash/ >> If you find real-world data sets that tend to induce bad-case >> behavior in Python's hash, please do tell. It would be reason >> enough to adjust the hash function. The hashes in popular >> software such as Python are already quite well vetted. > > As Python allows you to make your own hash functions for your own > classes, another danger is that you are dealing with objects with a > bad hash function. I've actually tried it (as I am *not* a pro!) and > FWIW here are the results. > > -------- badhash.py ---------- > > class BadHash(object): > def __hash__(self): > return 1 # that's a bad hash function! The sorting algorithm can have the same problem. With user-defined methods, there's no telling how long __cmp__ will actually take. -- --Bryan -- http://mail.python.org/mailman/listinfo/python-list