New submission from edk <e...@kellett.im>:
The C implementation of sum() contains this comment: /* It's tempting to use PyNumber_InPlaceAdd instead of PyNumber_Add here, to avoid quadratic running time when doing 'sum(list_of_lists, [])'. However, this would produce a change in behaviour: a snippet like empty = [] sum([[x] for x in range(10)], empty) would change the value of empty. */ But that doesn't hold after PyNumber_Add has been called once, because from that point onward the accumulator value is something we got from an __add__, and the caller can't know if we reuse it. The in-place version is substantially faster for some workloads-- the example in the comment is an obvious one, but the context in which I ran into this was using a sum of Counters in a one-liner a bit like this: sum((Counter({line.split("|", 3): len(line)}) for line in sys.stdin), Counter()) in which significant time seems to be spent adding the contents of the previous accumulator value to the new one. # before ; ./python -m timeit 'sum(([x] for x in range(1000)), [])' 500 loops, best of 5: 888 usec per loop # after ; ./python -m timeit 'sum(([x] for x in range(1000)), [])' 5000 loops, best of 5: 65.3 usec per loop ; cat test_.py from collections import Counter import timeit import random data = [Counter({random.choice(['foo', 'bar', 'baz', 'qux']): random.randint(1,1000000)}) for _ in range(10000)] print(min(timeit.repeat('sum(data, Counter())', 'from collections import Counter', number=100, globals={'data': data}))) print(min(timeit.repeat('reduce(Counter.__iadd__, data, Counter())', 'from collections import Counter; from functools import reduce', number=100, globals={'data': data}))) # before ; ./python test_.py 1.8981186050223187 0.7094596439856105 # after ; ./python test_.py 0.715508968976792 0.7050370009965263 ---------- messages: 360583 nosy: edk priority: normal severity: normal status: open title: Use PyNumber_InPlaceAdd in sum() for the second iteration onward type: performance _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue39440> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com