Re: [openstack-dev] [all][python3] use of six.iteritems()

Victor Stinner Thu, 11 Jun 2015 08:07:33 -0700

Hi,

On 10/06/2015 02:15, Robert Collins wrote:

python2.7 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in d.items(): pass'
10 loops, best of 3: 76.6 msec per loop

python2.7 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in d.iteritems(): pass'
100 loops, best of 3: 22.6 msec per loop

.items() is 3x as slow as .iteritems().

Hmm, I don't get the same results. Try the attached benchmark. I'm using my own wrapper on top of timeit, because timeit is bad at calibrating the benchmark :-/ timeit gives unreliable results.


Results with CPU model Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz:

[ 10 keys ]
713 ns: iteritems
922 ns (+29%): items

[ 10^3 keys ]
42.1 us: iteritems
59.4 us (+41%): items


[ 10^6 keys (1 million) ]
89.3 ms: iteritems
442 ms (+395%): items

In my benchmark, .items() is 5x as slow as .iteritems(). The code to iterate over 1 million items takes almost half a second. IMO adding 300 ms to each request is not negligible for an application. If this delay is added multiple times (multiple loops iterating over 1 million items), we may reach up to 1 second on a user request :-/
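The gap at 10^6 keys is consistent with Python 2's dict.items() building a full list of tuples before the loop starts, while dict.iteritems() yields pairs lazily. The sketch below is one rough way to observe that with the standard timeit module only. It is not the benchmark.py wrapper used for the numbers above; the helper names and the key count are assumptions chosen for illustration, and on Python 3 (where iteritems() no longer exists) an explicit list() copy stands in for the Python 2 items() behaviour.

```python
import timeit


def consume_lazy(d):
    # Iterate without building an intermediate list:
    # dict.iteritems() on Python 2, the items() view on Python 3.
    pairs = d.iteritems() if hasattr(d, "iteritems") else d.items()
    for key, value in pairs:
        pass


def consume_copy(d):
    # Iterate over a full list of (key, value) tuples, as Python 2's
    # dict.items() does. On Python 3 the copy has to be made explicitly.
    pairs = d.items()
    if not isinstance(pairs, list):
        pairs = list(pairs)
    for key, value in pairs:
        pass


def best_time(func, arg, number=5, repeat=3):
    # Keep the minimum of several repeats to reduce noise from other
    # processes; returns seconds per call.
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    d = {str(i): i for i in range(10 ** 5)}  # illustrative size
    print("lazy: %.3f ms" % (best_time(consume_lazy, d) * 1e3))
    print("copy: %.3f ms" % (best_time(consume_copy, d) * 1e3))
```

Absolute timings depend on the machine and the interpreter, so only the ratio between the two lines is worth comparing.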

Anyway, when I write patches to port a project to Python 3, I don't want to touch *anything* on Python 2. The API, the performance, the behaviour, etc. must not change.

I don't want to be responsible for a slowdown, and I don't feel able to estimate whether replacing dict.iteritems() with dict.items() has a cost on a real application.

As Ihar wrote: it must be done in a separate patch, by developers who know the project well.

Currently, most developers writing Python 3 patches are not heavily involved in each ported project.


There is also dict.itervalues(), not only dict.iteritems().
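As a rough illustration of what helpers such as six.iteritems() and six.itervalues() provide, here is a minimal sketch that stays lazy on both major versions. These two functions are hypothetical stand-ins written for this example, not six's actual implementation.

```python
def iteritems(d):
    # Hypothetical stand-in for six.iteritems(): lazy on both versions.
    if hasattr(d, "iteritems"):
        return d.iteritems()   # Python 2: no temporary list
    return iter(d.items())     # Python 3: items() is already a view


def itervalues(d):
    # Same idea for values.
    if hasattr(d, "itervalues"):
        return d.itervalues()
    return iter(d.values())


if __name__ == "__main__":
    counts = {"a": 1, "b": 2}
    print(sorted(iteritems(counts)))  # [('a', 1), ('b', 2)]
    print(sum(itervalues(counts)))    # 3
```

With helpers of this kind the Python 2 code path keeps calling iteritems() and itervalues(), so a porting patch does not change Python 2 behaviour.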

"for key in dict.iterkeys()" can simply be written "for key in dict:".
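A small example of that equivalence, which behaves the same on Python 2 and Python 3 (the dictionary contents are made up for illustration):

```python
config = {"host": "localhost", "port": 8080}

# Iterating over the dict itself yields its keys ...
keys_direct = [key for key in config]

# ... in the same order as keys() (or iterkeys() on Python 2).
keys_method = list(config.keys())

# Membership tests also work on the dict directly.
has_port = "port" in config

if __name__ == "__main__":
    print(keys_direct == keys_method)  # True
    print(has_port)                    # True
```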

There is also xrange() vs range(), the debate is similar:
https://review.openstack.org/#/c/185418/

For Python 3, I suggest using "from six.moves import range" to get the Python 3 behaviour on Python 2: range() is then always lazy, it doesn't create a temporary list. IMO it makes the code more readable because "for i in xrange(n):" becomes "for i in range(n):". six does not appear outside the imports, and "range()" is better than "xrange()" for developers starting to learn Python.
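To show the effect without requiring six, here is a sketch that picks the lazy variant by hand. The name lazy_range and the first_squares() helper are hypothetical, introduced only for this example; with six installed, "from six.moves import range" gives the same result under the plain name range.

```python
try:
    lazy_range = xrange  # Python 2: the lazy variant
except NameError:
    lazy_range = range   # Python 3: range() is already lazy


def first_squares(n, limit=10 ** 9):
    # Take the first n squares out of a huge range. No list of `limit`
    # integers is ever created, so this returns immediately.
    result = []
    for i in lazy_range(limit):
        if i >= n:
            break
        result.append(i * i)
    return result


if __name__ == "__main__":
    print(first_squares(5))  # [0, 1, 4, 9, 16]
```

On Python 2, writing range(10 ** 9) in the same loop would first allocate a list of one billion integers.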


Victor

"""
Micro-benchmark for dict.items() vs dict.iteritems() on Python 2. Run it with:

./python.orig benchmark.py script bench_str.py --file=orig
./python.patched benchmark.py script bench_str.py --file=patched
./python.patched benchmark.py compare_to orig patched

Download benchmark.py from:

https://bitbucket.org/haypo/misc/raw/tip/python/benchmark.py
"""
import gc

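def consume_items(dico):
    # Python 2: items() builds a full list of (key, value) tuples first.
    for key, value in dico.items():
        pass


def consume_iteritems(dico):
    # Python 2 only: iteritems() yields pairs lazily, without a list.
    for key, value in dico.iteritems():
        pass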


def run_benchmark(bench):
    for nkeys in (10, 10**3, 10**6):
        bench.start_group('%s keys' % nkeys)
        dico = {str(index): index for index in range(nkeys)}

        bench.compare_functions(
            ('iteritems', consume_iteritems, dico),
            ('items', consume_items, dico),
        )
        dico = None
        gc.collect()
        gc.collect()

if __name__ == "__main__":
    import benchmark
    benchmark.main()

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
