On 2007-11-16, Robin Becker <[EMAIL PROTECTED]> wrote: > I'm trying to get my head round a timing puzzle I find whilst > optimizing A Kuchling's version of the Knuth line splitting > algorithm from the oedipus project. The puzzle is as follows in > highly abstract form (here active_nodes is a list of objects > kept in a sorted order, but there are no special methods on the > objects for comparison etc) > > for i in xrange(m): > ....... > for A in active_nodes[:]: > ...... > if cond: > active_nodes.remove(A) > ...... > ..... > > is slow my test takes 2.7 seconds; profiling reveals we're > calling A.remove thousands of times and it's taking a lot of > time. There appear to be no other usages of active_nodes in the > for A loop. > > On a hunch I changed the above to > > removed = [] > aremoved = removed.append > for i in xrange(m): > ....... > for iA,A in enumerate(active_nodes): > ...... > if cond: > aremoved(iA) > ...... > if removed:
'if removed' is reduntant, as the loop body below will not be executed if removed is empty. > for iA in reversed(removed): > del active_nodes[iA] > del removed[:] > ...... > > this latter is 5 times faster, but I can't really see why. You are no longer making m copies of active_nodes. > My only guess is that appending to the removed list is > relatively easy compared with deletes and the reverse order > delete is somehow moving less data about. I think in the > original Knuth the deletion would have been from some real kind > of list and O(1) whereas here deletions are O(n/2) on average. > On the other hand we have to keep this list in sorted order. > What data structure should I be using? When you have to make many deletions from the middle of a sequence, you would normally choose a linked list. Python doesn't provide much support for linked lists, unfortunately. Instead, filter your list. It looks like you can't use filter directly, so just do it manually. for i in xrange(m): ....... saved_nodes = [] for A in active_nodes[:]: ...... if not cond: saved_nodes.append(A) ...... active_nodes = saved_nodes ..... -- Neil Cerutti Sermon Outline: I. Delineate your fear II. Disown your fear III. Displace your rear --Church Bulletin Blooper -- http://mail.python.org/mailman/listinfo/python-list