Re: improving a huge double-for cycle

Tim Chase Thu, 18 Sep 2008 05:59:19 -0700

Code: Select all
    for i in range(len(IN)): #scan all elements of the list IN
      for j in range(len(IN)):
        if i <> j:
         if IN[i].coordinates[0] == IN[j].coordinates[0]:
           if IN[i].coordinates[1] == IN[j].coordinates[1]:
              SN.append(IN[i].label)



Unfortunately my len(IN) is about 100.000 and the running time about
15h !!!! :(

Any idea to improve it?

[snip]

I have already tried to group the "if statements" in a single one:

Code: Select all
    if i <> j and if IN[i].coordinates[0] == IN[j].coordinates[0] and
if IN[i].coordinates[1] == IN[j].coordinates[1]:

but no improvements.

It's like rearranging deck-chairs on the Titanic :) Yes, it maygive a speed up, but what's 3 seconds when you're waiting 15hr :)

Not knowing the len(IN[x].coordinates) or their structure, ifit's a list of len==2, you should be able to just do


  if i <> j and IN[i].coordinates == IN[j].coordinates

or

  if i <> j and IN[i].coordinates[:2] == IN[j].coordinates[:2]

However, again, this is just polish. The big problem is that youhave an O(N^2) algorithm that's killing you.

1) use xrange instead of range to save eating memory with a hugeunneeded array.

2) unless you need to append duplicate labels, you know that whenI and J are swapped, you'll reach the same condition again, so itmight be worth writing the outer loops to eliminate thisscenario, and in the process, but just starting at i+1 ratherthan i, you can forgo the check if "i<>j".


Such changes might look something like

  for i in xrange(len(IN)):
    for j in xrange(i+1, len(IN)):
      if IN[i].coordinates == IN[j].coordinates:
        SN.append(IN[i].label)

If my college algorithms memory serves me sufficiently, thisreduces your O(N^2) to O(N log N) which will garner you somedecent time savings.


-tkc


--
http://mail.python.org/mailman/listinfo/python-list

Re: improving a huge double-for cycle

Reply via email to