Hi,

On Fri, Apr 8, 2011 at 1:52 PM, Bert Gunter <gunter.ber...@gene.com> wrote:
> 1. I am not an expert on this.

Definitely me neither, but:

> 2. However, my strong prior would be no, since because it is "exact" it has
> to calculate all the possible configurations and there are a lot to
> calculate with the values of n1 and n2 you gave.

But there are situations where one could get away with an approximation,
given large enough samples (i.e. large counts in the contingency table), no?

For instance, my "wikipedia-certified statistics course" suggests that with
large N, chisq.test should give a "decent" approximation to the p-value.
You can play with that as you like.

Also, the function "sage.test" in the "sagenhaft" package uses a "binomial
approximation to the Fisher Exact test". A slight modification from its
examples:

R> library(sagenhaft)
R> s <- sage.test(c(0,5,10), c(0,30,50), n1=10000, n2=15000)

## And the fisher.test equivalents:
R> M <- list(matrix(c(0,0,10000-0,15000-0),2,2),
             matrix(c(5,30,10000-5,15000-30),2,2),
             matrix(c(10,50,10000-10,15000-50),2,2))
R> m <- sapply(M, function(m) fisher.test(m)$p.value)

## How close are they to each other?
R> s - m
[1] 0.000000e+00 1.110054e-05 2.916176e-06

You can find the package here:
http://www.bioconductor.org/packages/release/bioc/html/sagenhaft.html

I guess you (Jim) can judge if it's (i) faster and (ii) appropriate to use
in your scenario.

Enjoy,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
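
P.S. As a rough illustration of the chisq.test suggestion above, here is a minimal sketch that puts chi-squared p-values next to fisher.test p-values for the two non-empty tables from the example. The variable names (tables, p_fisher, p_chisq) are made up for illustration. The all-zero table is left out on the assumption that chisq.test has nothing sensible to say when an expected count is zero. No output is shown on purpose; whether the agreement is "decent" enough for your data is something to check yourself.

```r
# Two of the 2x2 tables from the example above.
# Rows: library 1 and library 2; columns: count of the tag, count of everything else.
tables <- list(
  matrix(c(5, 30, 10000 - 5, 15000 - 30), 2, 2),
  matrix(c(10, 50, 10000 - 10, 15000 - 50), 2, 2)
)

# Exact p-values from Fisher's test.
p_fisher <- sapply(tables, function(m) fisher.test(m)$p.value)

# Chi-squared approximation. For 2x2 tables chisq.test applies the
# Yates continuity correction by default (correct = TRUE).
p_chisq <- sapply(tables, function(m) chisq.test(m)$p.value)

# Side-by-side comparison of the two sets of p-values.
print(cbind(fisher = p_fisher, chisq = p_chisq, diff = p_chisq - p_fisher))
```

Passing correct = FALSE to chisq.test gives the uncorrected statistic, if you want to see how much the continuity correction matters here.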