Tim Peters <t...@python.org> added the comment:

>> j is even implies (j ^ -3) == -(j ^ 3)

> This follows from what I posted before: if j is even, then
> j ^ 3 is odd, so we can apply the rule x ^ -2 = -x to x = j ^ 3
> ...

Thanks!  That helps a lot.  I had a blind spot there.  This kind of thing would 
have been best spelled out at the start:  it's not just that -2 has some bad 
effects, but that there's a whole universe of potentially damaging identities:

j odd  implies   j      = -(j ^ -2)
j even implies  (j ^ 1) = -(j ^ -1)
j odd  implies  (j ^ 2) = -(j ^ -4)
j even implies  (j ^ 3) = -(j ^ -3)
j odd  implies  (j ^ 4) = -(j ^ -6)
j even implies  (j ^ 5) = -(j ^ -5)
j odd  implies  (j ^ 6) = -(j ^ -8)
j even implies  (j ^ 7) = -(j ^ -7)
j odd  implies  (j ^ 8) = -(j ^ -10)
j even implies  (j ^ 9) = -(j ^ -9)
j odd  implies  (j ^ 10) = -(j ^ -12)
j even implies  (j ^ 11) = -(j ^ -11)
j odd  implies  (j ^ 12) = -(j ^ -14)
j even implies  (j ^ 13) = -(j ^ -13)
j odd  implies  (j ^ 14) = -(j ^ -16)
j even implies  (j ^ 15) = -(j ^ -15)
...

Every integer pairs up with another of the opposite sign in one of these - but 
with only one.  That goes a long way toward explaining why collisions are 
frequent when mixing signs on ints of similar magnitude, but also why they 
don't "pile up" catastrophically.

But it also suggests a way to construct cases that _are_ catastrophic.  For 
example:

>>> len(set(map(hash, product([-3, 3], repeat=20))))
1024

Ouch!

>>> c = Counter(map(hash, product([-3, 3], repeat=20)))
>>> len(c)
1024
>>> max(c.values())
1024
>>> min(c.values())
1024

So each of the 1024 hash codes showed up 1024 times.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34751>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to