New submission from Raymond Hettinger <raymond.hettin...@gmail.com>:
Running "d1.items() ^ d2.items()" will rehash every key and value in both dictionaries regardless of how much they overlap. By taking advantage of the known hashes, the analysis step could avoid making any calls to __hash__(). Only the result tuples would need to hashed. Currently the code below calls hash for every key and value on the left and for every key and value on the right: >>> left = {1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 6: -6, 7: -7} >>> right = {1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 8: -8, 9: -9} >>> left.items() ^ right.items() # Total work: 28 __hash__() calls {(6, -6), (7, -7), (8, -8), (9, -9)} Compare that with the workload for set symmetric difference which makes zero calls to __hash__(): >>> set(left) ^ set(right) {6, 7, 8, 9} FWIW, I do have an important use case where this matters. ---------- components: Interpreter Core messages: 370839 nosy: rhettinger, serhiy.storchaka priority: normal severity: normal status: open title: Symmetric difference on dict_views is inefficient type: performance versions: Python 3.10 _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue40889> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com