New submission from Daniel Fleischman <danielfleisch...@gmail.com>:
The following code takes quadratic time on the size of the dictionary passed, regardless of the dictionary (explanation below): ``` def slow_dictionary(d): while len(d) > 0: # Remove first element key = next(iter(d)) del d[key] ``` The problem is that when an element is deleted a NULL/NULL placeholder is set in its place (https://github.com/python/cpython/blob/818628c2da99ba0376313971816d472c65c9a9fc/Objects/dictobject.c#L1534) and when we try to find the first element with `next(iter(d))` the code needs to skip over all the NULL elements until it finds the first non-NULL element (https://github.com/python/cpython/blob/818628c2da99ba0376313971816d472c65c9a9fc/Objects/dictobject.c#L1713). I'm not sure of what is the best way to fix it, but note that simply adding a field to the struct with the position of the first non-NULL element is not enough, since a code that always deletes the SECOND element of the dictionary would still have linear operations. An easy (but memory-wasteful) fix would be to augment the struct PyDictKeyEntry with the indices of the next/previous non empty elements, and augment _dictkeysobject with the index of the first and last non empty elements (in other words, maintain an underlying linked list of the non empty entries). With this we can always iterate in O(1) per entry. (I tested it only on version 3.9.2, but I would be surprised if it doesn't happen in other versions as well). ---------- components: Interpreter Core messages: 396880 nosy: danielfleischman priority: normal severity: normal status: open title: Dictionary operations are LINEAR for any dictionary (for a particular code). type: performance versions: Python 3.10, Python 3.11, Python 3.6, Python 3.7, Python 3.8, Python 3.9 _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue44555> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com