Jim Jewett <jimjjew...@gmail.com> added the comment:

To be more explicit about Martin A. Lemburg's msg151121 (which I agree with):

Count the collisions on a single lookup. 
If they exceed a threshhold, do something different.

Martin's strawman proposal was threshhold=1000, and raise.  It would be just as 
easy to say "whoa!  5 collisions -- time to use the alternative hash instead" 
(and, possibly, to issue a warning).  

Even that slight tuning removes the biggest objection, because it won't ever 
actually fail.

Note that the use of a (presumably stronger 2nd) hash wouldn't come into play 
until (and unless) there was a problem for that specific key in that specific 
dictionary.  For the normal case, nothing changes -- unless we take advantage 
of the existence of a 2nd hash to simplify the first few rounds of collision 
resolution.  (Linear probing is more cache-friendly, but also more vulnerable 
to worst-case behavior -- but if probing stops at 4 or 8, that may not matter 
much.)  For quick scripts, the 2nd hash will almost certainly never be needed, 
so startup won't pay the penalty.

The only down side I see is that the 2nd (presumably randomized) hash won't be 
cached without another slot, which takes more memory and shouldn't be done in a 
bugfix release.

----------
nosy: +Jim.Jewett

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13703>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to