That's right: if the test is exact, it is not possible to get a p-value of zero. wilcox.test does not provide an exact p-value in the presence of ties, so if there are any ties in your data you are getting a normal approximation. Incidentally, if there are any ties in your data set I would strongly recommend computing the *exact* p-value, because using the normal approximation on tied data sets will either inflate the type I error rate or reduce power, depending on how the ties are distributed. Depending on the pattern of ties, this can result in gross under- or overestimation of the p-value.
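As a rough illustration of the difference, here is a minimal base-R sketch that compares the normal-approximation p-value from wilcox.test with an exact permutation p-value on tied data. The data values and variable names are made up for the example, and the two-sided p-value is defined here by distance from the null mean of the rank sum, which is one common convention rather than the only one. Full enumeration with combn is only practical for small samples; dedicated exact-test software uses much faster algorithms.

```r
# Hypothetical two-sample data with ties (illustration only)
x <- c(1, 2, 2, 3, 3)
y <- c(3, 4, 4, 5, 6)

# With ties, wilcox.test() warns and falls back to the normal approximation
approx_p <- suppressWarnings(wilcox.test(x, y)$p.value)

# Exact permutation p-value by full enumeration of group assignments
r   <- rank(c(x, y))                 # midranks for the pooled sample
n   <- length(x)
N   <- length(r)
obs <- sum(r[seq_len(n)])            # observed rank sum for x
mu  <- n * (N + 1) / 2               # mean of the rank sum under H0

# Rank sum of every possible choice of n observations out of N
all_sums <- combn(N, n, function(idx) sum(r[idx]))

# Two-sided: proportion of assignments at least as far from mu as observed
exact_p <- mean(abs(all_sums - mu) >= abs(obs - mu))

cat("normal approximation:", approx_p, "\n")
cat("exact (permutation): ", exact_p, "\n")
```

Because the observed assignment is itself one of the enumerated ones, exact_p can never be smaller than 1/choose(N, n).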
I guess this is all by way of saying that you should always compute the exact p-value if possible. The package exactRankTests uses the algorithm by Mehta, Patel and Tsiatis (1984). If your sample sizes are larger, there is a freely available .exe by Cheung and Klotz (1995) that will compute exact p-values for sample sizes larger than 100 in each group! You can find it at http://pages.cs.wisc.edu/~klotz/

Bryan

> Hi Murat,
> I am not an expert in either statistics or R, but I can imagine that since
> the default is exact=TRUE, it numerically computes the probability, and it
> may indeed be 0. If you use wilcox.test(x, y, exact=FALSE) it will give you
> a normal approximation, which will most likely be different from zero.

No, the exact p-value can't be zero for a discrete distribution. The smallest possible value in this case would, I think, be 1/choose(length(x)+length(y),length(x)), or perhaps twice that.

More generally, the approach used by format.pval() is to display very small p-values as <2e-16, where 2e-16 is machine epsilon. I wouldn't want to claim optimality for this choice, but it seems a reasonable way to represent "very small".

    -thomas

> Hope this helps.
> Keo.
>
> Murat Tasan wrote:
>> hi, folks,
>>
>> how have you gone about reporting a p-value from a test when the
>> returned value from a test (in this case a rank-sum test) is
>> numerically equal to 0 according to the machine?
>>
>> the next lowest value greater than zero that is distinct from zero on
>> the machine is likely algorithm-dependent (the algorithm of the test
>> itself), but without knowing the explicit steps of the algorithm
>> implementation, it is difficult to provide any non-zero value. i
>> initially thought to look at .mach...@double.xmin, but i'm not
>> comfortable with reporting p < .mach...@double.xmin, since without
>> knowing the specifics of the implementation, this may not be true!
>>
>> to be clear, if i have data x, and i run the following line, the
>> returned value is TRUE.
>>
>> wilcox.test(x)$p.value == 0
>>
>> thanks for any help on this!
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

Thomas Lumley                   Assoc. Professor, Biostatistics
tlum...@u.washington.edu        University of Washington, Seattle

-------------
Bryan Keller, Doctoral Student/Project Assistant
Educational Psychology - Quantitative Methods
The University of Wisconsin - Madison
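For anyone who wants to check the lower bound Thomas mentions, here is a small sketch for the two-sample case, assuming no ties and completely separated samples (the data are made up for the example). It compares the exact two-sided p-value from wilcox.test with 2/choose(length(x)+length(y), length(x)), and shows how format.pval() reports a value below machine epsilon. For the one-sample signed-rank call in the original question the counting argument would be different, so this only covers the rank-sum case.

```r
# Hypothetical data: every x is below every y, no ties
x <- 1:10
y <- 11:20

# Exact two-sided rank-sum p-value
p_exact <- wilcox.test(x, y, exact = TRUE)$p.value

# Smallest attainable values under the permutation distribution
min_one_sided <- 1 / choose(length(x) + length(y), length(x))
min_two_sided <- 2 * min_one_sided

cat("exact p-value:       ", p_exact, "\n")
cat("2 / choose(n+m, n):  ", min_two_sided, "\n")
cat("attains lower bound: ", isTRUE(all.equal(p_exact, min_two_sided)), "\n")

# format.pval() prints anything below eps (default .Machine$double.eps)
# as "<eps" instead of as a literal zero
small_p <- format.pval(0, digits = 3)
cat("format.pval(0):      ", small_p, "\n")
```

So for an exact test, a reported value of exactly 0 points to an approximation or to underflow, not to a true probability of zero.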