Stefan,

I am sorry that I wasn’t more careful writing my question. I misrepresented my 
problem when I wrote “the sum of the probabilities in lpvec should be <=1, but 
it is not. The sum is something on the order of 1.48e-13.”
It is absolutely right that lpvec doesn’t contain probabilities, it contains 
log-probabilities that are negative numbers. Also, clearly, the value of 
1.48e-13 is <=1!
What I meant is that I wanted to add probabilities in the log-probability space 
and that I expected a negative value as a result. Instead, I am getting a 
positive value.

You are also right that if I wanted to test for a positive association I should 
add the probabilities of k>=J to obtain a p-value. This is clear to me, but I 
want a metric of the strength of association between animals that takes higher 
values when the association is stronger. A final result in the log-probability 
space is useful because I want to use the strength of association to build 
networks of individuals.

Finally, your idea of using Fisher’s exact test is very appealing for the 
simplicity and availability of a working function in R, but I am having trouble 
interpreting the result, or the result is different from that of my formula. 
Here’s a simple example:

Say N=2, N1=1, N2=1, and J=1. In this case, if the individuals are completely 
independent of each other, I expect to see them together with a probability of 
0.5. That is the output from my function, in probability space. If I run 
Fisher’s exact test on the corresponding contingency table, however, I get a 
p-value of 1. This is what I did:

fisher.test(rbind(c(J,N1-J),c(N2-J,N-N1-N2+J)))

Why should this be, when the Fisher’s exact test is giving me the probability 
of obtaining the observed arrangement under the null hypothesis of no 
association between the animals. Am I interpreting the output incorrectly. 
Thank you for answer, it is great to find an even simpler way of addressing the 
problem.

Best,

Gonçalo

> On Jun 26, 2016, at 9:35 AM, Stefan Evert <stefa...@collocations.de> wrote:
> 
> Why do you want to do this? Why not simply use Fisher's exact test?
> 
> N <- 2178
> N1 <- 165
> N2 <- 331
> J <- 97
> ct <- rbind(c(J, N1-J), c(N2-J, N-N1-N2+J))
> fisher.test(ct)
> 
> Background explanation:
> 
> - Your formula computes the log hypergeometric probability for a contingency 
> table as ct above, but with k instead of J.
> 
> - It does so in an unnecessarily complicated way: three terms would be enough 
> (cf. the equation at http://www.collocations.de/AM/section3.html; with C1=N1, 
> C2=N-C1, R1=N2).
> 
> - If you want to test for a positive association between the two animals, you 
> should be adding up the probabilities for k >= J to obtain a p-value, rather 
> than k <= J (what would this sum of probabilities tell you?).
> 
> - lpvec doesn't contain probabilities, but log probabilities. What sense 
> would there be in adding those up? In any case, you should obtain a negative 
> value because all the individual logs are negative.
> 
> Best,
> Stefan
> 
> 
> 
>> On 25 Jun 2016, at 16:13, Gonçalo Ferraz <gferra...@gmail.com> wrote:
>> 
>> I am working on interactions between animals, studying whether animal 1 is 
>> attracted to animal 2 (or vice-versa). I looked for the two animals in 
>> N=2178 sampling occasions, finding animal 1 a total of N1=165 times, and 
>> animal 2 a total of N2=331 times. In J=97 occasions, I saw both animals at 
>> the same time. 
>> 
>> The more frequently I see the two animals in the same sampling occasion, the 
>> more I will believe that they are attracted to each other. Therefore, I want 
>> to calculate the probability of finding J<=97 when the two animals are 
>> moving around independently of each other. The higher this probability, the 
>> stronger the attraction.
>> 
>> Following Veech (Journal of Biogeography 2014, 41: 1029-1035) I compute the 
>> log probability of obtaining a number n of encounters between animals as 
>> ‘lpn’ in the function lveech:
>> 
> 

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to