Re: Pairwise count of frequency from an incidence matrix of group membership

Peter Otten Wed, 20 Apr 2011 01:33:57 -0700

Shafique, M. (UNU-MERIT) wrote:

> Hi,
> I have a number of different groups g1, g2, … g100 in my data. Each group
> is comprised of a known but different set of members from the population
> m1, m2, … m1000. The data has been organized in an incidence matrix:
>      g1 g2 g3 g4 g5
> m1    1  1  1  0  1
> m2    1  0  0  1  0
> m3    0  1  1  0  0
> m4    1  1  0  1  1
> m5    0  0  1  1  0
> 
> I need to count how many groups each possible pair of members shares (i.e.,
> both are members of).
> I would prefer the result as a pairwise edge list with weight/frequency in
> a format like the following:
> m1, m1, 4
> m1, m2, 1
> m1, m3, 2
> m1, m4, 3
> m1, m5, 1
> m2, m2, 2
> ... and so on.
> 
> I would greatly appreciate it if anybody could suggest/share some
> code/tool/module which could help do this.


Homework? What have you tried?

One strategy is to turn each row of the initial matrix into a set of the
groups that member belongs to, giving a list of sets:

matrix = [
    [1, 1, 1, 0, 1],  # m1
    [1, 0, 0, 1, 0],  # m2
    ...
]

sets = [  # zero-based group indices
    set([0, 1, 2, 4]),  # m1
    set([0, 3]),        # m2
    ...
]

The enumerate() builtin may help you with the conversion. You can then find 
the shared groups with set arithmetic:

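sets[0] & sets[1]  # groups shared by m1 and m2

The size of that intersection, len(sets[0] & sets[1]), is the weight for
the m1/m2 pair.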

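Putting the pieces together might look like the following. This is only a
minimal sketch, not a definitive solution: it uses the five-row matrix from
the question, the helper names row_to_set() and pairwise_counts() are made
up for illustration, and it assumes members are simply labelled m1, m2, ...
in row order. Pairs are generated with
itertools.combinations_with_replacement() so that each member is also
paired with itself, as in the requested output.

```python
from itertools import combinations_with_replacement

# Incidence matrix from the question: rows are members m1..m5,
# columns are groups g1..g5.
matrix = [
    [1, 1, 1, 0, 1],  # m1
    [1, 0, 0, 1, 0],  # m2
    [0, 1, 1, 0, 0],  # m3
    [1, 1, 0, 1, 1],  # m4
    [0, 0, 1, 1, 0],  # m5
]


def row_to_set(row):
    """Return the zero-based indices of the groups flagged in one row."""
    return {index for index, flag in enumerate(row) if flag}


def pairwise_counts(matrix):
    """Yield (member_a, member_b, shared_group_count) for every pair."""
    sets = [row_to_set(row) for row in matrix]
    # Assumption: members are named m1, m2, ... in row order.
    names = ["m%d" % (i + 1) for i in range(len(sets))]
    for a, b in combinations_with_replacement(range(len(sets)), 2):
        # The intersection holds the groups both members belong to.
        yield names[a], names[b], len(sets[a] & sets[b])


edges = list(pairwise_counts(matrix))
for a, b, n in edges:
    print("%s, %s, %d" % (a, b, n))
```

For 1000 members this produces roughly half a million pairs, which plain
Python sets should handle without trouble; if you do not want the m1, m1
style self-pairs, itertools.combinations() can be used instead.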

-- 
http://mail.python.org/mailman/listinfo/python-list