In article <[EMAIL PROTECTED]>, "John Machin" <[EMAIL PROTECTED]> wrote:
> You don't need to think. This problem has been extensively picked over > by folk who are a lot smarter than us, starting from 30 years ago. > Google for "disjoint set" and "union-find". One gets the impression > that the best possible algorithm would be slightly *worse* than O(n). It can be solved by union-find (e.g. with UnionFind from <http://www.ics.uci.edu/~eppstein/PADS/>): import UnionFind import sets def merge(pairs): uf = UnionFind.UnionFind() items = sets.Set() for a,b in pairs: uf.union(a,b) items.add(a) items.add(b) leaders = {} for a in items: leaders.setdefault(uf[a],sets.Set()).add(a) return [list(component) for component in leaders.values()] If you do all the unions before all the finds, as in this algorithm, union-find is O(n). However it might be easier to treat the input pairs as the edges of a graph and apply a depth-first-search based graph connected components algorithm. -- David Eppstein Computer Science Dept., Univ. of California, Irvine http://www.ics.uci.edu/~eppstein/ -- http://mail.python.org/mailman/listinfo/python-list