New submission from Serhiy Storchaka <storchaka+cpyt...@gmail.com>:
To count the number of items that satisfy a certain condition, you can use either sum(1 for x in data if pred(x)) or sum(pred(x) for x in data), where pred(x) is a boolean expression. The latter is shorter but slower. There are two causes for this:

1. The generator expression needs to generate more items: not only when pred(x) is true, but also when pred(x) is false.
2. sum() is optimized for integers and floats, but not for bools.

The first cause is out of the scope of this issue, but sum() can be optimized for bools.

$ ./python -m timeit -s "a = [True] * 10**6" -- "sum(a)"
Unpatched: 10 loops, best of 5: 22.3 msec per loop
Patched:   50 loops, best of 5: 6.26 msec per loop

$ ./python -m timeit -s "a = list(range(10**6))" -- "sum(x % 2 == 0 for x in a)"
Unpatched: 5 loops, best of 5: 89.8 msec per loop
Patched:   5 loops, best of 5: 67.5 msec per loop

$ ./python -m timeit -s "a = list(range(10**6))" -- "sum(1 for x in a if x % 2 == 0)"
5 loops, best of 5: 53.9 msec per loop

----------
components: Interpreter Core
messages: 341330
nosy: rhettinger, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Optimize sum() for bools
type: performance
versions: Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36781>
_______________________________________
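For reference, the two counting idioms compared in the message can be sketched as follows. This is only an illustration of why both give the same count, not part of the proposed patch; the helper names count_filtered, count_bools and is_even are hypothetical and chosen for this example.

```python
def count_filtered(data, pred):
    # Yields a 1 only for matching items, so sum() adds plain ints
    # and the generator produces fewer items.
    return sum(1 for x in data if pred(x))


def count_bools(data, pred):
    # Yields one bool per item. This works because bool is a subclass
    # of int (True == 1, False == 0), so sum() has to add bool objects.
    return sum(pred(x) for x in data)


def is_even(x):
    # Hypothetical predicate, matching the "x % 2 == 0" test in the timings.
    return x % 2 == 0


data = list(range(10))
filtered_count = count_filtered(data, is_even)
bool_count = count_bools(data, is_even)
print(filtered_count, bool_count)
```

The two forms agree only when pred(x) returns a real bool (or 0/1). If pred returns some other truthy value, such as a non-empty string or an arbitrary integer, sum(pred(x) for x in data) either fails or adds up those values instead of counting them, while the filtered form still counts.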