Hello,
Thanks for the correction, and sorry, Murat, I was mistaken. Your answers actually solved a problem I was having when running multiple fisher.test() calls on nucleic acid sequences, where we end up with hundreds of thousands of p-values, many of which are 0. Since we have to correct for multiple testing, even very, very small p-values might end up not being significant. I had assumed the 0's were tied p-values, but now I know I can use the numerator and the denominator to rank the 0's, even if I don't have the exact p-value.
Best,
Keo.

Marc Schwartz wrote:
Once one gets past the issue of the p value being extremely small, irrespective of the test being used, the OP has asked the question of how to report it.

Most communities will have standards for how to report p values, covering things like how many significant digits and a minimum p value threshold to report.

For example, in medicine, it is common to report 'small' p values as 'p < 0.001' or 'p < 0.0001'.

Thus, below those numbers, the precision is largely irrelevant and one need not report the actual p value.
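As a minimal sketch of that reporting convention in R: base R's format.pval() takes an eps argument, and any value below eps is printed as "<eps" rather than with its full precision. The 0.001 threshold and the p-values below are made up for illustration; the right threshold depends on the guidance for your field.

```r
# Illustrative p-values (invented for this sketch), including an exact zero
p <- c(0.03, 0.0004, 0)

# Values below eps are reported as "<eps" instead of with their full precision
reported <- format.pval(p, digits = 2, eps = 0.001)
print(reported)
```

The two smallest values are collapsed to the same "less than" string, which is the point of the convention: below the threshold the exact magnitude is not reported.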

I just wanted to be sure that we don't lose sight of the forest for the trees... :-)

The OP should consult a relevant guidance document or an experienced author in the domain of interest.

HTH,

Marc Schwartz


On Sep 16, 2009, at 9:54 AM, Bryan Keller wrote:

That's right, if the test is exact it is not possible to get a p-value of zero. wilcox.test does not provide an exact p-value in the presence of ties, so if there are any ties in your data you are getting a normal approximation. Incidentally, if there are any ties in your data set I would strongly recommend computing the *exact* p-value, because using the normal approximation on tied data sets will either inflate the type I error rate or reduce power, depending on how the ties are distributed. Depending on the pattern of ties, this can result in gross under- or overestimation of the p-value.

I guess this is all by way of saying that you should always compute the exact p-value if possible.
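For small samples, an exact p-value with ties can be sketched in base R by brute force: enumerate every way of assigning the pooled midranks to the first group and compare each rank sum with the observed one. This is only an illustration under stated assumptions. The function name exact_ranksum_p is hypothetical, the two-sided definition used (distance from the null mean of the rank sum) is one of several possible choices, and this is not the algorithm used by exactRankTests.

```r
# Brute-force exact two-sided p-value for the two-sample rank-sum test.
# Works with ties because it enumerates the permutation distribution of the
# midrank sum directly. Only feasible for small samples.
exact_ranksum_p <- function(x, y) {
  r <- rank(c(x, y))                  # midranks handle ties
  n <- length(x)
  obs <- sum(r[seq_len(n)])           # observed rank sum of x
  # rank sum for every possible assignment of n pooled ranks to group x
  all_sums <- combn(length(r), n, function(idx) sum(r[idx]))
  centre <- n * (length(r) + 1) / 2   # null mean of the rank sum
  # proportion of assignments at least as far from the null mean as observed
  mean(abs(all_sums - centre) >= abs(obs - centre) - 1e-9)
}

# Made-up data with ties within and between the groups
x <- c(1, 2, 2, 3)
y <- c(3, 4, 5, 5, 6)
p_exact <- exact_ranksum_p(x, y)
print(p_exact)
```

Because the result is a count of assignments divided by choose(9, 4), it can never be exactly zero. The enumeration grows combinatorially, so for larger samples a dedicated implementation is needed.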

The package exactRankTests uses the algorithm by Mehta, Patel and Tsiatis (1984). If your sample sizes are larger, there is a freely available .exe by Cheung and Klotz (1995) that will compute exact p-values for sample sizes larger than 100 in each group!

You can find it at http://pages.cs.wisc.edu/~klotz/

Bryan

Hi Murat,
I am not an expert in either statistics or R, but I can imagine that, since the default is exact=TRUE, it numerically computes the probability, and it may indeed be 0. If you use wilcox.test(x, y, exact=FALSE) it will give you a normal approximation, which will most likely be different from zero.

No, the exact p-value can't be zero for a discrete distribution. The smallest possible value in this case would, I think, be 1/choose(length(x)+length(y),length(x)), or perhaps twice that.
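A quick way to check that bound is to run the exact test on two completely separated samples without ties, which is the most extreme configuration possible. The data below are made up for illustration. wilcox.test() is two-sided by default, so the p-value should come out as twice the one-sided bound.

```r
# Two completely separated samples without ties: the most extreme case
x <- 1:5
y <- 6:10

p_min <- wilcox.test(x, y, exact = TRUE)$p.value
bound <- 1 / choose(length(x) + length(y), length(x))

print(c(p_value = p_min, one_sided_bound = bound, two_sided_bound = 2 * bound))
```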

More generally, the approach used by format.pval() is to display very small p-values as <2e-16, where 2e-16 is roughly machine epsilon (.Machine$double.eps, about 2.2e-16). I wouldn't want to claim optimality for this choice, but it seems a reasonable way to represent "very small".
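To make the two machine constants concrete, here is a small sketch using base R only; the value 1e-300 is an arbitrary illustration. It shows that a double can be far smaller than machine epsilon and still be non-zero, while format.pval() with its default eps collapses both it and an exact 0 to a "less than" string.

```r
eps  <- .Machine$double.eps    # about 2.2e-16: spacing of doubles near 1
xmin <- .Machine$double.xmin   # about 2.2e-308: smallest normalized positive double

# A value far below eps is still a distinct, non-zero double
tiny <- 1e-300
print(tiny > 0)

# With the default eps, both the tiny value and an exact 0 are shown as "< eps"
shown <- format.pval(c(tiny, 0))
print(shown)
```

So a displayed "<2e-16" says nothing about whether the underlying number is zero; it is a display threshold, not a property of the test.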

    -thomas


Hope this helps.
Keo.

Murat Tasan wrote:
hi, folks,

how have you gone about reporting a p-value from a test when the
returned value from a test (in this case a rank-sum test) is
numerically equal to 0 according to the machine?

the next lowest value greater than zero that is distinct from zero on
the machine is likely algorithm-dependent (the algorithm of the test
itself), but without knowing the explicit steps of the algorithm
implementation, it is difficult to provide any non-zero value.  i
initially thought to look at .mach...@double.xmin, but i'm not
comfortable with reporting p < .mach...@double.xmin, since without
knowing the specifics of the implementation, this may not be true!

to be clear, if i have data x, and i run the following line, the
returned value is TRUE.

wilcox.test(x)$p.value == 0

thanks for any help on this!

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

