* Astley Le Jasper, on 20.09.2010 23:42:
I have a list of tuples that indicate a relationship, ie a is related
to b, b is related to c etc etc. What I want to do is cluster these
relationships into groups.  An item will only be associated with a
single cluster.

Before I started, I wondered if there was any particular tool within
Python I should be looking at. I don't expect anyone to code this for
me, just say ... "you need to look at using x". I was going to use
populate a dictionary and

Sorry for being so vague.

Example Data:
[(a,b)
(a,c)
(a,d)
(b,c)
(b,d)
(c,d)
(e,f)
(e,g)
(f,g)
(h,i)]

Output (grouping id, item ref)
(1,a),
(1,b),
(1,c),
(1,d),
(2,e),
(2,f),
(2,g),
(3,h),
(3,i)

It seems to be the same problem as "equivalence sets".

This problem was solved early on because e.g. Fortran compilers had to construct such sets (equivalence partitions of a set).

I though I'd just say "Google it!", because I know there's a standard algorithm but I can't recall the name. However, googling it I found no mention of that algorithm. Not even in the Wikipedia articles on equivalence sets.

A number of approaches spring to mind:

  Approach 1:
  Multi-pass. Originally you assign a distinct set number to each symbol.
  In each pass through the symbols you replace one number with another as
  per one of the equivalence specs. Stop when no replacements are done.

  Approach 2:
  Merging. For each new equivalence A=B you check whether A has been assigned
  to a set, if not you create a new one, call that S1, and ditto for B, S2.
  Merge S1 and S2, e.g. move all elements of S2 to S1.

  Approach 3:
  In a first pass convert the data to more explicit form, linking each symbol
  to the symbols it's directly equivalent to. Then in second pass simply drag
  out each equivalence group (recurse on each symbol's list of equivalences).

Approaches 1 and 2 seem to be pretty inefficient for a large number of symbols, but I think approach 1 may be a practical option for a small number of symbols.


Cheers & hth.,

- Alf

--
blog at <url: http://alfps.wordpress.com>
--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to