On Feb 1, 1:28 pm, Arnaud Delobelle <[EMAIL PROTECTED]> wrote:
> Yours is O(n^2) and \Omega(n^2). I think mine is O(max(c, nlogn))
> (assuming constant time access for dictionaries, but 'inside' could be
> replaced with a list) where c is the number of overlaps. I'll try to
> post a proof later (if it's true!). So if we are in a situation where
> overlaps are rare, it is in practice O(nlogn).
I have slightly modified overlaps() in order to make the analysis
easier (I have made overlaps() consider all intervals open). Here is
the new version:

from itertools import chain, starmap

def open_interval(i, (a, b)):
    return (a, 1, i), (b, 0, i)

def overlaps(lst, interval=open_interval):
    bounds = chain(*starmap(interval, enumerate(lst)))    # 1
    inside = {}                                           # 2
    for x, _, i in sorted(bounds):                        # 3
        if inside.pop(i, None) is None:                   # 4
            for j, y in inside.iteritems():               # 5
                yield i, j
            inside[i] = x                                 # 6

'Detailed' analysis of running overlaps(lst) follows, where

* lst is of length n
* the total number of overlaps is m

(assuming dict.__getitem__ and dict.__setitem__ are done in constant
time [1])

1.   is a simple loop over lst; it takes An
2.   is constant (K)
3.   the sorted(...) call takes Bnlogn
3,4. this loop is iterated 2n times; with the if test it takes Cn
5.   this loop is iterated exactly once for each overlap, so it takes Dm
6.   this statement is executed n times; it takes En

Altogether the time taken is:

    K + (A+C+E)n + Bnlogn + Dm

So if m is dominated by nlogn, the algorithm is O(nlogn). Otherwise
one cannot hope for better than O(m)!

-- 
Arnaud

[1] Otherwise one could use a combination of an array and a doubly
linked list to achieve constant time access, deletion and updating of
the 'inside' structure. Anyway, even if it were O(log n) the overall
complexity would be the same!
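
Here is a minimal usage sketch (made-up data, Python 2 as in the code
above, and assuming the definitions above are in scope). Note that
overlaps() yields pairs of indices into lst, not the intervals
themselves:

intervals = [(1, 4), (2, 6), (5, 8), (9, 12)]

# each yielded pair is (index of the interval just opened,
#                       index of an interval already open)
pairs = list(overlaps(intervals))
print pairs
# -> [(1, 0), (2, 1)]

# translate the index pairs back into the intervals themselves
print [(intervals[i], intervals[j]) for i, j in pairs]
# -> [((2, 6), (1, 4)), ((5, 8), (2, 6))]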
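
And, for completeness, a sketch of the structure mentioned in footnote
[1] (my own illustration, not part of the original code): since the
keys of 'inside' are just the list indices 0..n-1, an array of nodes
threaded together by a doubly linked list gives worst-case constant
time add/pop and iteration proportional to the number of currently
open intervals.

class Inside(object):
    def __init__(self, n):
        # slots[i] is None while interval i is closed,
        # otherwise a node [prev_index, next_index, x]
        self.slots = [None] * n
        self.head = None              # first open interval, or None

    def add(self, i, x):
        # link a new node for i at the front of the list: O(1)
        if self.head is not None:
            self.slots[self.head][0] = i
        self.slots[i] = [None, self.head, x]
        self.head = i

    __setitem__ = add                 # so "inside[i] = x" still works

    def pop(self, i, default=None):
        # unlink and return the value stored for i, or default: O(1)
        node = self.slots[i]
        if node is None:
            return default
        prv, nxt, x = node
        if prv is None:
            self.head = nxt
        else:
            self.slots[prv][1] = nxt
        if nxt is not None:
            self.slots[nxt][0] = prv
        self.slots[i] = None
        return x

    def iteritems(self):
        # walk only the currently open intervals
        j = self.head
        while j is not None:
            yield j, self.slots[j][2]
            j = self.slots[j][1]

With that, replacing "inside = {}" by "inside = Inside(len(lst))" is
the only change needed in overlaps(): pop, iteritems and item
assignment behave there just as the dict versions do.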