That's right: if the test is exact, it is not possible to get a p-value of zero. 
wilcox.test does not compute an exact p-value in the presence of ties, so if 
there are any ties in your data you are getting a normal approximation.  
Incidentally, if there are ties in your data set I would strongly recommend 
computing the *exact* p-value, because using the normal approximation on tied 
data will either inflate the type I error rate or reduce power, depending on 
how the ties are distributed.  The result can be a gross under- or 
overestimate of the p-value.

I guess this is all by way of saying that you should always compute the exact 
p-value if possible.
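
For small samples the exact p-value with ties can be obtained in base R by 
enumerating every way of assigning the pooled midranks to the first group. The 
following is only a rough sketch: exact_ranksum_p is a made-up helper name, the 
data are invented for illustration, and the two-sided p-value is defined here 
as the proportion of assignments whose rank sum is at least as far from its 
null mean as the observed one (other conventions exist).

```r
# Exact two-sided rank-sum p-value by full enumeration.
# Hypothetical helper, not part of any package; feasible only for small
# samples because it visits all choose(n + m, n) assignments.
exact_ranksum_p <- function(x, y) {
  n <- length(x)
  r <- rank(c(x, y))                     # midranks are assigned to ties
  obs <- sum(r[seq_len(n)])              # observed rank sum of x
  # Rank sum of x under every possible split of the pooled ranks
  all_sums <- combn(length(r), n, function(idx) sum(r[idx]))
  mu <- n * (length(r) + 1) / 2          # null mean of the rank sum
  # Small tolerance guards against floating-point noise in the comparison
  mean(abs(all_sums - mu) >= abs(obs - mu) - 1e-9)
}

# Invented data containing ties
x <- c(1, 2, 2, 3, 3)
y <- c(3, 4, 4, 5, 5, 6)

p_exact  <- exact_ranksum_p(x, y)
p_normal <- suppressWarnings(wilcox.test(x, y, exact = FALSE)$p.value)
print(c(exact = p_exact, normal_approx = p_normal))
```

With no ties the helper should agree with wilcox.test(x, y, exact = TRUE); 
with ties the two numbers printed above can differ noticeably.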

The package exactRankTests uses the algorithm by Mehta, Patel, and Tsiatis 
(1984).  If your sample sizes are larger, there is a freely available .exe by 
Cheung and Klotz (1995) that computes exact p-values for sample sizes larger 
than 100 in each group!

You can find it at http://pages.cs.wisc.edu/~klotz/

Bryan

> Hi Murat,
> I am not an expert in either statistics or R, but I can imagine that since 
> the default is exact=TRUE, it numerically computes the probability, and it 
> may indeed be 0. If you use wilcox.test(x, y, exact=FALSE) it will give you 
> a normal approximation, which will most likely be different from zero.

No, the exact p-value can't be zero for a discrete distribution. The smallest 
possible value in this case would, I think, be 
1/choose(length(x)+length(y),length(x)), or perhaps twice that.
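
As a quick check of that bound, here is a small sketch with invented, 
completely separated samples and no ties. In this case the two-sided exact 
p-value comes out at twice 1/choose(n+m, n). The one-sample (signed-rank) 
analogue, which is what wilcox.test(x) computes, is included for comparison; 
its floor of 2/2^n is stated here as an assumption of the example, not 
something from the reply above.

```r
# Two-sample case: every x is below every y, so the rank sum is extreme.
x <- 1:5
y <- 6:10
p_two <- wilcox.test(x, y, exact = TRUE)$p.value
floor_two <- 2 / choose(length(x) + length(y), length(x))   # two-sided floor

# One-sample signed-rank case: all observations positive, no ties.
z <- 1:10
p_signed <- wilcox.test(z, exact = TRUE)$p.value
floor_signed <- 2 / 2^length(z)          # 2^n equally likely sign patterns

print(c(p_two = p_two, floor_two = floor_two))
print(c(p_signed = p_signed, floor_signed = floor_signed))
```

So when the exact test applies, "p < (smallest attainable value)" is never 
needed; the smallest attainable value itself can be reported.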

More generally, the approach used by format.pval() is to display very small 
p-values as <2e-16, where 2e-16 is (approximately) machine epsilon, 
.Machine$double.eps.  I wouldn't want to claim optimality for this choice, but 
it seems a reasonable way to represent "very small".
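
A minimal illustration of that display convention, using base R's 
format.pval() (the input values are made up). The exact number of digits 
printed after the "<" depends on the digits argument, so no particular output 
string is claimed here.

```r
eps <- .Machine$double.eps               # about 2.2e-16

# A p-value that is numerically zero, or merely below eps, is shown as
# "< eps" instead of as a number.
shown_zero  <- format.pval(0)
shown_small <- format.pval(1e-20, digits = 1)

# Ordinary p-values are formatted as plain numbers.
shown_ok <- format.pval(0.0312)

print(c(shown_zero, shown_small, shown_ok))
```

The eps argument of format.pval() can be set to a different threshold if 
machine epsilon is not the cutoff you want to report.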

     -thomas


> Hope this helps.
> Keo.
>
> Murat Tasan wrote:
>> hi, folks,
>> 
>> how have you gone about reporting a p-value from a test when the
>> returned value from a test (in this case a rank-sum test) is
>> numerically equal to 0 according to the machine?
>> 
>> the next lowest value greater than zero that is distinct from zero on
>> the machine is likely algorithm-dependent (the algorithm of the test
>> itself), but without knowing the explicit steps of the algorithm
>> implementation, it is difficult to provide any non-zero value.  i
>> initially thought to look at .Machine$double.xmin, but i'm not
>> comfortable with reporting p < .Machine$double.xmin, since without
>> knowing the specifics of the implementation, this may not be true!
>> 
>> to be clear, if i have data x, and i run the following line, the
>> returned value is TRUE.
>> 
>> wilcox.test(x)$p.value == 0
>> 
>> thanks for any help on this!
>> 
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 

Thomas Lumley                   Assoc. Professor, Biostatistics
tlum...@u.washington.edu        University of Washington, Seattle



-------------
Bryan Keller, Doctoral Student/Project Assistant
Educational Psychology - Quantitative Methods
The University of Wisconsin - Madison
