New submission from Gregory P. Smith <[EMAIL PROTECTED]>: The attached script simply loops building a list of tuples. It has horrible performance as the list gets larger compared to something appending simple objects like ints to the list.
% python tuple_gc_hell.py ~ ... 1000000 1.3615500927 2000000 2.95893096924 3000000 4.53531980515 4000000 5.62795209885 5000000 7.70974612236 ... The time it takes to execute grows linearly with the number of tuples already appended to the list. Turning on gc.debug(gc.DEBUG_STATS) shows why as does running python under a profiler: % cumulative self self total time seconds seconds calls s/call s/call name 27.85 115.84 115.84 14270 0.01 0.02 collect 21.19 204.01 88.17 1115120314 0.00 0.00 tupletraverse 13.14 258.65 54.64 1655624731 0.00 0.00 visit_reachable 10.72 303.25 44.60 1655624731 0.00 0.00 visit_decref 5.97 328.10 24.85 338 0.07 1.10 PyEval_EvalFrame It appears the cyclic gc is rechecking all of these tuples repeatedly. disabling gc during the loop or using a very high gc.set_threshold hides the problem. ---------- components: Interpreter Core files: tuple_gc_hell.py messages: 74512 nosy: gregory.p.smith priority: normal severity: normal status: open title: Building a list of tuples has non-linear performance type: performance versions: Python 2.5, Python 2.6, Python 2.7 Added file: http://bugs.python.org/file11743/tuple_gc_hell.py _______________________________________ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue4074> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com