Stefan, I am sorry that I wasn’t more careful writing my question. I misrepresented my problem when I wrote “the sum of the probabilities in lpvec should be <=1, but it is not. The sum is something on the order of 1.48e-13.” It is absolutely right that lpvec doesn’t contain probabilities, it contains log-probabilities that are negative numbers. Also, clearly, the value of 1.48e-13 is <=1! What I meant is that I wanted to add probabilities in the log-probability space and that I expected a negative value as a result. Instead, I am getting a positive value.
You are also right that if I wanted to test for a positive association I should add the probabilities of k>=J to obtain a p-value. This is clear to me, but I want a metric of the strength of association between animals that takes higher values when the association is stronger. A final result in the log-probability space is useful because I want to use the strength of association to build networks of individuals. Finally, your idea of using Fisher’s exact test is very appealing for the simplicity and availability of a working function in R, but I am having trouble interpreting the result, or the result is different from that of my formula. Here’s a simple example: Say N=2, N1=1, N2=1, and J=1. In this case, if the individuals are completely independent of each other, I expect to see them together with a probability of 0.5. That is the output from my function, in probability space. If I run Fisher’s exact test on the corresponding contingency table, however, I get a p-value of 1. This is what I did: fisher.test(rbind(c(J,N1-J),c(N2-J,N-N1-N2+J))) Why should this be, when the Fisher’s exact test is giving me the probability of obtaining the observed arrangement under the null hypothesis of no association between the animals. Am I interpreting the output incorrectly. Thank you for answer, it is great to find an even simpler way of addressing the problem. Best, Gonçalo > On Jun 26, 2016, at 9:35 AM, Stefan Evert <stefa...@collocations.de> wrote: > > Why do you want to do this? Why not simply use Fisher's exact test? > > N <- 2178 > N1 <- 165 > N2 <- 331 > J <- 97 > ct <- rbind(c(J, N1-J), c(N2-J, N-N1-N2+J)) > fisher.test(ct) > > Background explanation: > > - Your formula computes the log hypergeometric probability for a contingency > table as ct above, but with k instead of J. > > - It does so in an unnecessarily complicated way: three terms would be enough > (cf. the equation at http://www.collocations.de/AM/section3.html; with C1=N1, > C2=N-C1, R1=N2). > > - If you want to test for a positive association between the two animals, you > should be adding up the probabilities for k >= J to obtain a p-value, rather > than k <= J (what would this sum of probabilities tell you?). > > - lpvec doesn't contain probabilities, but log probabilities. What sense > would there be in adding those up? In any case, you should obtain a negative > value because all the individual logs are negative. > > Best, > Stefan > > > >> On 25 Jun 2016, at 16:13, Gonçalo Ferraz <gferra...@gmail.com> wrote: >> >> I am working on interactions between animals, studying whether animal 1 is >> attracted to animal 2 (or vice-versa). I looked for the two animals in >> N=2178 sampling occasions, finding animal 1 a total of N1=165 times, and >> animal 2 a total of N2=331 times. In J=97 occasions, I saw both animals at >> the same time. >> >> The more frequently I see the two animals in the same sampling occasion, the >> more I will believe that they are attracted to each other. Therefore, I want >> to calculate the probability of finding J<=97 when the two animals are >> moving around independently of each other. The higher this probability, the >> stronger the attraction. >> >> Following Veech (Journal of Biogeography 2014, 41: 1029-1035) I compute the >> log probability of obtaining a number n of encounters between animals as >> ‘lpn’ in the function lveech: >> > ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.