Benoit Thiell wrote:

> I am experiencing a puzzling problem with both Python 2.4 and Python
> 2.6 on CentOS 5. I'm looking for an explanation of the problem and
> possible solutions. Here is what I did:
>
> Python 2.4.3 (#1, Sep 21 2011, 19:55:41)
> IPython 0.8.4 -- An enhanced Interactive Python.
>
> In [1]: def test():
>    ...:     return [(i,) for i in range(10**6)]
>
> In [2]: %time x = test()
> CPU times: user 0.82 s, sys: 0.04 s, total: 0.86 s
> Wall time: 0.86 s
>
> In [4]: big_list = range(50 * 10**6)
>
> In [5]: %time y = test()
> CPU times: user 9.11 s, sys: 0.03 s, total: 9.14 s
> Wall time: 9.15 s
>
> As you can see, after creating a list of 50 million integers, creating
> the same list of 1 million tuples takes about 10 times longer than the
> first time.
>
> I ran these tests on a machine with 144GB of memory and it is not
> swapping. Before creating the big list of integers, IPython used 111MB
> of memory; after the creation, it used 1664MB of memory.
In older Pythons the heuristic used to decide when to run the cyclic
garbage collector is not well suited to the creation of many objects in
a row: the collector fires every time net allocations cross a threshold,
and its periodic full collections traverse every container already
alive -- including big_list -- so each run gets slower as the heap
grows. Try switching it off temporarily:

    import gc

    gc.disable()
    # create many objects that are here to stay
    gc.enable()

You may also incorporate that into your test function:

    import gc

    def test():
        gc.disable()
        try:
            return [(i,) for i in range(10**6)]
        finally:
            gc.enable()
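On the 2.6 interpreter shown above you could also wrap the
disable/enable pair in a context manager, so the collector is restored
even if the body raises. This is only a sketch built on the stdlib gc
and contextlib modules; the helper name gc_disabled is mine, not a
standard API:

    import gc
    from contextlib import contextmanager

    @contextmanager
    def gc_disabled():
        # Remember whether the collector was running, so nested use
        # does not re-enable a collector somebody else turned off.
        was_enabled = gc.isenabled()
        gc.disable()
        try:
            yield
        finally:
            if was_enabled:
                gc.enable()

    def test():
        with gc_disabled():
            return [(i,) for i in range(10**6)]

Note that contextlib.contextmanager only exists from Python 2.5 on, so
the plain try/finally version remains the portable choice for the 2.4
interpreter.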