Re: [R] create a pairwise coocurrence matrix

David Winsemius Thu, 11 Nov 2010 07:23:19 -0800


On Nov 11, 2010, at 4:44 AM, Stefan Evert wrote:


Pasted and realigned from original posting:

      term1 term2 term3 term4 term5
term1     0     2     0     1     3
term2     2     0     0     1     2
term3     0     0     0     0     0
term4     1     1     0     0     1
term5     3     2     0     1     1
Any ideas on how to do that?

If I understood you correctly, you have this matrix of indicator variables for occurrences of terms in documents:
A <- matrix(c(1,1,0,0,1,1,1,0,1,1,1,0,0,0,1), nrow=3, byrow=TRUE,dimnames=list(paste("doc",1:3), paste("term",1:5)))
 A
and want to determine co-occurrence counts for pairs of terms, right? (The formatting of your matrices was messed up, and some of your co-occurrence counts don't make sense to me.)
The fastest and easiest solution is

 t(A) %*% A
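
For readers who want to see why the product gives co-occurrence counts, here is a minimal sketch using the same A as above. The variable names by_hand and prod_mat are only illustrative, and crossprod() is mentioned as the base R shortcut for the same product, not as something proposed in the thread.

```r
# Same indicator matrix as in the post: rows are documents, columns are terms.
A <- matrix(c(1,1,0,0,1,
              1,1,0,1,1,
              1,0,0,0,1),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste("doc", 1:3), paste("term", 1:5)))

# Entry (i, j) of t(A) %*% A is the sum over documents d of A[d, i] * A[d, j].
# Each product is 1 only when both terms occur in document d.
by_hand  <- sum(A[, "term 1"] * A[, "term 2"])
prod_mat <- t(A) %*% A
prod_mat["term 1", "term 2"] == by_hand

# crossprod(A) computes the same product without forming t(A) explicitly.
all.equal(crossprod(A), prod_mat)
```

Both comparisons should come back TRUE. Note that this only counts documents correctly when A holds 0/1 indicators; with raw term frequencies the entries would be sums of products instead.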

That is really elegant. (Wish I could remember my linear algebra lessons as well from forty years ago.) I checked it against the specified output and found that, with one exception, the OP had planned for the diagonal to be filled with zeroes. So that could be completed by a simple modification:


temp <- t(A) %*% A
diag(temp) <- 0
temp
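
As a rough check of the "one exception" remark, the sketch below rebuilds A, zeroes the diagonal, and compares the result with the table pasted from the original posting. The names cooc, doc_freq, wanted and diff_cells are made up for this illustration, and the wanted matrix is simply retyped from that table.

```r
A <- matrix(c(1,1,0,0,1,
              1,1,0,1,1,
              1,0,0,0,1),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste("doc", 1:3), paste("term", 1:5)))

cooc <- t(A) %*% A

# Before zeroing, the diagonal holds the number of documents containing each term.
doc_freq <- diag(cooc)
diag(cooc) <- 0

# Desired output as pasted from the original posting.
wanted <- matrix(c(0,2,0,1,3,
                   2,0,0,1,2,
                   0,0,0,0,0,
                   1,1,0,0,1,
                   3,2,0,1,1),
                 nrow = 5, byrow = TRUE, dimnames = dimnames(cooc))

# Cells where the computed matrix and the posted table disagree.
diff_cells <- which(cooc != wanted, arr.ind = TRUE)
diff_cells
```

Only the [5, 5] cell differs: the posted table has 1 there, while the zeroed diagonal has 0. Saving doc_freq first is optional, but it keeps the per-term document counts that diag(temp) <- 0 would otherwise discard.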

--
David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
