In <bf29ddcb-347f-49ca-a1ef-66cd5fae4...@z4g2000prh.googlegroups.com> Jay Bird <jay.bird0...@gmail.com> writes:
>Hi everyone, >I've been trying to figure out a simple algorithm on how to combine a >list of parts that have 1D locations that overlap into a non- >overlapping list. For example, here would be my input: >part name location >a 5-9 >b 7-10 >c 3-6 >d 15-20 >e 18-23 >And here is what I need for an output: >part name location >c.a.b 3-10 >d.e 15-23 >I've tried various methods, which all fail. Does anyone have an idea >how to do this? This problem is basically isomorphic to the problem of finding connected components in an undirected graph. If your parts are nodes in a graph, there's an edge between them if they overlap. The following is probably overkill, but it works. It uses the module pygraph, which you can download from http://code.google.com/p/python-graph/ The real work is done by its connected_components function. import pygraph from pygraph.algorithms.accessibility import connected_components class Part(object): def __init__(self, name, start, finish): self.name = name if start >= finish: raise ValueError("start must be strictly less than finish") self.start = start self.finish = finish def __str__(self): return self.name def combine_parts(parts): gr = pygraph.graph() gr.add_nodes(parts) # connect the nodes for i in range(len(parts) - 1): part_i = parts[i] for j in range(i + 1, len(parts)): part_j = parts[j] if (part_j.start <= part_i.finish <= part_j.finish or part_i.start <= part_j.finish <= part_i.finish): gr.add_edge(part_i, part_j) cc = connected_components(gr) c2n = {} # build connected components as lists for k, v in cc.items(): c2n.setdefault(v, []).append(k) ret = [] for nodes in c2n.values(): nodes.sort(key=lambda x: x.start) name = '.'.join(x.name for x in nodes) start = min(x.start for x in nodes) finish = max(x.finish for x in nodes) ret.append((name, start, finish)) return ret if __name__ == '__main__': data = [Part('a', 5, 9), Part('b', 7, 10), Part('c', 3, 6), Part('d', 15, 20), Part('e', 18, 23)] print combine_parts(data) When you run the script above, the output is [('c.a.b', 3, 10), ('d.e', 15, 23)] HTH, kynn -- http://mail.python.org/mailman/listinfo/python-list