Shafique, M. (UNU-MERIT) wrote:
> Hi,
> I have a number of different groups g1, g2, … g100 in my data. Each group
> is comprised of a known but different set of members from the population
> m1, m2, …m1000. The data has been organized in an incidence matrix:
>
>      g1  g2  g3  g4  g5
> m1    1   1   1   0   1
> m2    1   0   0   1   0
> m3    0   1   1   0   0
> m4    1   1   0   1   1
> m5    0   0   1   1   0
>
> I need to count how many groups each possible pair of members share (i.e.,
> both are member of).
> I shall prefer the result in a pairwise edgelist with weight/frequency in
> a format like the following:
>
> m1, m1, 4
> m1, m2, 1
> m1, m3, 2
> m1, m4, 3
> m1, m5, 1
> m2, m2, 2
> ... and so on.
>
> I shall highly appreciate if anybody could suggest/share some
> code/tool/module which could help do this.

Homework? What have you tried?

One strategy is to create a list of sets containing the groups from the
initial matrix

    matrix = [
        [1, 1, 1, 0, 1],
        [1, 0, 0, 1, 0],
    ]
    sets = [
        # zero-based indices
        set([0,1,2,4]),
        set([0,3]),
        ...
    ]

The enumerate() builtin may help you with the conversion. You can then find
the shared groups with set arithmetic:

    sets[0] & sets[1] #m1/m2
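
For what it's worth, the strategy above could be sketched roughly as follows. This is only one possible way to do it, not a definitive solution: the helper names row_to_set and shared_group_edges are made up for illustration, the matrix is the five-member example from the original post, and the use of itertools.combinations_with_replacement to include the m1/m1-style self pairs is my assumption about the desired output.

```python
from itertools import combinations_with_replacement

# Incidence matrix from the original post:
# rows are members m1..m5, columns are groups g1..g5.
matrix = [
    [1, 1, 1, 0, 1],  # m1
    [1, 0, 0, 1, 0],  # m2
    [0, 1, 1, 0, 0],  # m3
    [1, 1, 0, 1, 1],  # m4
    [0, 0, 1, 1, 0],  # m5
]


def row_to_set(row):
    """Return the zero-based group indices flagged with 1 in this row."""
    return {col for col, flag in enumerate(row) if flag}


def shared_group_edges(matrix):
    """Return (member_a, member_b, shared_group_count) for every pair.

    Hypothetical helper: pairs include each member with itself, and pairs
    that share no group are kept with a weight of 0.
    """
    sets = [row_to_set(row) for row in matrix]
    edges = []
    for i, j in combinations_with_replacement(range(len(sets)), 2):
        weight = len(sets[i] & sets[j])  # set intersection = shared groups
        edges.append(("m%d" % (i + 1), "m%d" % (j + 1), weight))
    return edges


edges = shared_group_edges(matrix)
for a, b, w in edges:
    print("%s, %s, %d" % (a, b, w))
```

Pairs with no group in common come out with weight 0; if those are not wanted in the edgelist, they can be skipped with a simple "if weight" test before appending.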