On Apr 12, 7:02 am, [EMAIL PROTECTED] wrote:
> I would suggest to configure the default behaviour of the garbage
> collector in such a way that this squared complexity is avoided
> without requiring specific knowledge and intervention by the user.
> Not being an expert in these details I would like to ask the gurus
> how this could be done.
Well, the garbage collector activates whenever allocations exceed deallocations by a certain amount, which for obvious reasons is the source of the bottleneck here: while you build a huge structure that stays alive, the collector keeps making passes over it and never frees anything.

My first stab at a suggestion would be to adjust the threshold depending on how successful the collector is. So if a garbage collection run collects no objects, double (or whatever) the threshold until the next run. More formulaically: if your threshold is X, and a garbage collection pass collects Y objects, then the threshold for the next pass would be something like 2*X/(Y/X+1). Then you can create your very large data structure in only O(N log N) time.

But as Steve Holden said, it'll probably mess someone else up. There's not much you can do with a garbage collector to make it please everyone.

A question often asked of people who are creating very large data structures in Python (and I am not a big fan of these sorts of questions, but it is worth thinking about) is "Why are you doing that?" That is, you should consider whether some kind of database solution would be better. You mention lots of dicts; it sounds like some balanced B-trees with disk loading on demand could be a good choice.
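For what it's worth, you don't have to wait for the collector to grow an adaptive threshold: the gc module already lets you raise the thresholds, or switch collection off for the duration of the build. A rough, untested sketch (the builder function and the numbers are just stand-ins for whatever you're actually constructing):

import gc

def build_big_structure(n):
    # Stand-in for the real workload: lots of container objects
    # that all stay reachable, so every GC pass scans them and
    # frees nothing.
    return [{"id": i} for i in range(n)]

# The three generational thresholds; the defaults are small
# (typically (700, 10, 10)).
print(gc.get_threshold())

# Option 1: raise the generation-0 threshold so collection passes
# happen far less often during the build.
gc.set_threshold(100000, 10, 10)

# Option 2: turn the cyclic collector off for the bulk build, then
# re-enable it and do a single pass at the end.
gc.disable()
try:
    data = build_big_structure(1000000)
finally:
    gc.enable()
    gc.collect()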
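And if the database route makes sense, the standard library's shelve module is the path of least resistance: it behaves like a dict but keeps the records in a dbm file on disk, loading them on demand. A toy sketch, with a made-up filename and record layout:

import shelve

# "bigdata.db" is just a placeholder filename; shelve picks a dbm
# backend and hands back a dict-like object.
db = shelve.open("bigdata.db")
try:
    for i in range(1000000):
        # Keys must be strings; values can be anything picklable.
        db[str(i)] = {"id": i, "payload": i * i}
    # Individual records are read back from disk as you ask for
    # them instead of all living in memory at once.
    print(db["123456"])
finally:
    db.close()

For anything fancier (range queries, transactions, concurrent writers) a real database or an on-disk B-tree library would be the next step.

Carl Banks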