Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN[i].coordinates[0] == IN[j].coordinates[0]:
if IN[i].coordinates[1] == IN[j].coordinates[1]:
SN.append(IN[i].label)
Unfortunately my len(IN) is about 100.000 and the running time about
15h !!!! :(
Any idea to improve it?
[snip]
I have already tried to group the "if statements" in a single one:
Code: Select all
if i <> j and if IN[i].coordinates[0] == IN[j].coordinates[0] and
if IN[i].coordinates[1] == IN[j].coordinates[1]:
but no improvements.
It's like rearranging deck-chairs on the Titanic :) Yes, it may
give a speed up, but what's 3 seconds when you're waiting 15hr :)
Not knowing the len(IN[x].coordinates) or their structure, if
it's a list of len==2, you should be able to just do
if i <> j and IN[i].coordinates == IN[j].coordinates
or
if i <> j and IN[i].coordinates[:2] == IN[j].coordinates[:2]
However, again, this is just polish. The big problem is that you
have an O(N^2) algorithm that's killing you.
1) use xrange instead of range to save eating memory with a huge
unneeded array.
2) unless you need to append duplicate labels, you know that when
I and J are swapped, you'll reach the same condition again, so it
might be worth writing the outer loops to eliminate this
scenario, and in the process, but just starting at i+1 rather
than i, you can forgo the check if "i<>j".
Such changes might look something like
for i in xrange(len(IN)):
for j in xrange(i+1, len(IN)):
if IN[i].coordinates == IN[j].coordinates:
SN.append(IN[i].label)
If my college algorithms memory serves me sufficiently, this
reduces your O(N^2) to O(N log N) which will garner you some
decent time savings.
-tkc
--
http://mail.python.org/mailman/listinfo/python-list