[issue40889] Symmetric difference on dict_views is inefficient

Raymond Hettinger Sat, 06 Jun 2020 10:11:19 -0700


New submission from Raymond Hettinger <raymond.hettin...@gmail.com>:


Running "d1.items() ^ d2.items()" will rehash every key and value in both 
dictionaries regardless of how much they overlap.

By taking advantage of the known hashes, the analysis step could avoid making 
any calls to __hash__().  Only the result tuples would need to hashed.

Currently the code below calls hash for every key and value on the left and for 
every key and value on the right:

  >>> left = {1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 6: -6, 7: -7}
  >>> right = {1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 8: -8, 9: -9}
  >>> left.items() ^ right.items()        # Total work: 28 __hash__() calls
  {(6, -6), (7, -7), (8, -8), (9, -9)}

Compare that with the workload for set symmetric difference which makes zero 
calls to __hash__():

  >>> set(left) ^ set(right)
  {6, 7, 8, 9}

FWIW, I do have an important use case where this matters.

----------
components: Interpreter Core
messages: 370839
nosy: rhettinger, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Symmetric difference on dict_views is inefficient
type: performance
versions: Python 3.10

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40889>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue40889] Symmetric difference on dict_views is inefficient

Reply via email to