On Tue, Jan 3, 2012 at 5:59 PM, Peter Otten <__pete...@web.de> wrote:
> Benoit Thiell wrote:
>
>> I am experiencing a puzzling problem with both Python 2.4 and Python
>> 2.6 on CentOS 5. I'm looking for an explanation of the problem and
>> possible solutions. Here is what I did:
>>
>> Python 2.4.3 (#1, Sep 21 2011, 19:55:41)
>> IPython 0.8.4 -- An enhanced Interactive Python.
>>
>> In [1]: def test():
>>    ...:     return [(i,) for i in range(10**6)]
>>
>> In [2]: %time x = test()
>> CPU times: user 0.82 s, sys: 0.04 s, total: 0.86 s
>> Wall time: 0.86 s
>>
>> In [4]: big_list = range(50 * 10**6)
>>
>> In [5]: %time y = test()
>> CPU times: user 9.11 s, sys: 0.03 s, total: 9.14 s
>> Wall time: 9.15 s
>>
>> As you can see, after creating a list of 50 million integers, creating
>> the same list of 1 million tuples takes about 10 times longer than the
>> first time.
>>
>> I ran these tests on a machine with 144GB of memory and it is not
>> swapping. Before creating the big list of integers, IPython used 111MB
>> of memory; after the creation, it used 1664MB of memory.
>
> In older Pythons the heuristic used to decide when to run the cyclic
> garbage collection is not well suited for the creation of many objects
> in a row. Try switching it off temporarily with
>
> import gc
> gc.disable()
> # create many objects that are here to stay
> gc.enable()
>
> You may also incorporate that into your test function:
>
> def test():
>     gc.disable()
>     try:
>         return [...]
>     finally:
>         gc.enable()
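[For readers following along, here is a minimal runnable sketch of the pattern Peter describes: suspend the cyclic collector while allocating many container objects in a row, and restore it in a finally block so it is re-enabled even if the allocation raises. The function name `build_tuples` and the sizes are illustrative only, and on recent CPython versions the GC heuristics have changed, so you may not reproduce the slowdown the original poster saw.]

```python
import gc

def build_tuples(n):
    """Build n one-element tuples with the cyclic GC suspended.

    In older CPythons, creating many containers in a row repeatedly
    trips the collection threshold, so each chunk of allocations pays
    for a scan of all tracked objects (including any huge pre-existing
    list). Disabling the collector around the bulk allocation avoids
    those scans; reference counting still reclaims garbage as usual.
    """
    gc.disable()
    try:
        return [(i,) for i in range(n)]
    finally:
        gc.enable()  # always restore the collector

tuples = build_tuples(10**5)
print(len(tuples), tuples[0], gc.isenabled())
```

Note that `gc.disable()` is process-wide state, which is why the try/finally is important: leaving the collector off would let any reference cycles created later accumulate unreclaimed.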
Thanks Peter, this is very helpful. Modifying my test according to your
directions produced much more consistent results.

Benoit.

--
Benoit Thiell
The SAO/NASA Astrophysics Data System
http://adswww.harvard.edu/