New submission from Serhiy Storchaka <storchaka+cpyt...@gmail.com>:

To count the number of items that satisfy a certain condition you can use either

    sum(1 for x in data if pred(x))

or

    sum(pred(x) for x in data)

where pred(x) is a boolean expression.
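
As a minimal sketch of the two idioms, assuming a small sample list and a hypothetical predicate (neither `data` nor this `pred` comes from the report), both forms give the same count:

```python
# Sample input and predicate are illustrative assumptions, not from the report.
data = [3, 8, 15, 22, 42]

def pred(x):
    # Hypothetical predicate: true for even numbers.
    return x % 2 == 0

# Form 1: yield 1 only for the items that satisfy the predicate.
count_filtered = sum(1 for x in data if pred(x))

# Form 2: yield a bool for every item and let sum() add them up.
count_bools = sum(pred(x) for x in data)

print(count_filtered, count_bools)
```

The second form relies on True and False adding up as 1 and 0.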

The latter case is shorter but slower. There are two causes for this:

1. The generator expression has to yield an item for every element of data, 
both when pred(x) is true and when pred(x) is false.

2. sum() is optimized for integers and floats, but not for bools.

The first cause is out of the scope of this issue, but sum() can be optimized 
for bools.
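
For context on why summing bools works at all, the following sketch shows only standard language behavior (it does not show the patch itself): bool is a subclass of int, so sum() over bools returns a plain int either way.

```python
# bool is a subclass of int, so True and False behave as 1 and 0 in arithmetic.
assert issubclass(bool, int)
assert True + True == 2

# Summing bools therefore yields an ordinary int, not a bool.
total = sum([True, False, True])
print(total, type(total).__name__)
```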

$ ./python -m timeit -s "a = [True] * 10**6" -- "sum(a)"
Unpatched:  10 loops, best of 5: 22.3 msec per loop
Patched:    50 loops, best of 5: 6.26 msec per loop

$ ./python -m timeit -s "a = list(range(10**6))" -- "sum(x % 2 == 0 for x in a)"
Unpatched:  5 loops, best of 5: 89.8 msec per loop
Patched:    5 loops, best of 5: 67.5 msec per loop

$ ./python -m timeit -s "a = list(range(10**6))" -- "sum(1 for x in a if x % 2 == 0)"
5 loops, best of 5: 53.9 msec per loop
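
The two generator-expression measurements above can also be approximated from within Python using the timeit module. This is a rough sketch, not the method used for the numbers in this report: the helper name `best_of` is hypothetical, and the list is shortened to 10**5 items (an assumption) to keep the run brief. Absolute timings will differ by machine and build.

```python
import timeit

# Assumption: a shorter list than the 10**6 items used above, to keep runs brief.
a = list(range(10**5))

def best_of(stmt, repeat=5, number=5):
    """Return the best per-loop time in seconds over `repeat` runs (hypothetical helper)."""
    times = timeit.repeat(stmt, globals={"a": a}, repeat=repeat, number=number)
    return min(times) / number

t_bools = best_of("sum(x % 2 == 0 for x in a)")
t_ones = best_of("sum(1 for x in a if x % 2 == 0)")

print(f"sum of bools:       {t_bools * 1e3:.2f} msec per loop")
print(f"sum of filtered 1s: {t_ones * 1e3:.2f} msec per loop")
```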

----------
components: Interpreter Core
messages: 341330
nosy: rhettinger, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Optimize sum() for bools
type: performance
versions: Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36781>
_______________________________________