After digging into the R source, it turns out that the argument `exact` has nothing to do with numeric precision. It only affects which distribution is used to compute the p-value. When `exact=TRUE`, the true (exact) null distribution of the statistic is used; otherwise, a normal approximation is used.
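To make the distinction concrete, here is a minimal sketch, assuming small samples of continuous data so that no ties occur. The sample size of 10 and the variable names `p_exact` and `p_normal` are illustrative choices, not anything from the thread.

```r
# Small samples, continuous data: no ties, so the exact distribution is available.
set.seed(1)
x <- rnorm(10)
y <- rnorm(10, mean = 2)

p_exact  <- wilcox.test(x, y, exact = TRUE)$p.value   # exact null distribution of W
p_normal <- wilcox.test(x, y, exact = FALSE)$p.value  # normal approximation

# Print both at full precision; they differ because the method differs,
# not because one shows more digits of the other.
print(c(exact = p_exact, normal = p_normal), digits = 15)
```

Both numbers are printed at full numeric precision, so the difference between them comes only from the choice of reference distribution.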
I think the documentation needs to be improved here: you can compute the exact p-value *only* when there are no ties in your data. If there are ties, you get the p-value from the normal approximation no matter what value you pass to `exact`. This behavior should be documented more clearly; as far as I can tell, R does at least warn ("cannot compute exact p-value with ties") when `exact=TRUE` and ties are present.

FYI, when the exact p-value is requested, the `pwilcox` function is used to compute it. There are no details on how it computes the p-value, but its C code seems to build the probability table, so I assume it computes the exact p-value from the true distribution of the statistic, not a permutation or Monte Carlo p-value.

Best,
Jiefei

On Fri, Mar 19, 2021 at 10:01 PM Jiefei Wang <szwj...@gmail.com> wrote:

> Hey,
>
> I just want to point out that the word "exact" has two meanings. It can
> mean the numerically accurate p-value, as Bogdan asked in his first email,
> or it can mean the p-value calculated from the exact distribution of the
> statistic (in this case, the U statistic). These two are actually not
> related, even though they are both called "exact".
>
> Best,
> Jiefei
>
> On Fri, Mar 19, 2021 at 9:31 PM Spencer Graves <
> spencer.gra...@effectivedefense.org> wrote:
>
>> On 2021-3-19 12:54 AM, Bogdan Tanasa wrote:
>> > thanks a lot, Vivek ! in other words, assuming that we work with 1000
>> > data points,
>> >
>> > shall we use EXACT = TRUE, it uses the normal approximation,
>> >
>> > while if EXACT=FALSE (for these large samples), it does not ?
>>
>>
>> As David Winsemius noted, the documentation is not clear.
>> Consider the following:
>>
>> > set.seed(1)
>> > x <- rnorm(100)
>> > y <- rnorm(100, 2)
>> >
>> > wilcox.test(x, y)$p.value
>> [1] 1.172189e-25
>> >
>> > wilcox.test(x, y, EXACT=TRUE)$p.value
>> [1] 1.172189e-25
>> > wilcox.test(x, y, exact=TRUE)$p.value
>> [1] 4.123875e-32
>> >
>> > wilcox.test(x, y, EXACT=FALSE)$p.value
>> [1] 1.172189e-25
>> > wilcox.test(x, y, exact=FALSE)$p.value
>> [1] 1.172189e-25
>>
>> We get two values here: 1.172189e-25 and 4.123875e-32. The first one, I
>> think, is the normal approximation, which is the same as exact=FALSE. I
>> think that with exact=TRUE, you get a permutation distribution, though
>> I'm not sure. You might try looking at "wilcox_test in package coin for
>> exact, asymptotic and Monte Carlo conditional p-values, including in the
>> presence of ties" to see if it is clearer.
>>
>> NOTE: R is case sensitive, so "EXACT" is a different variable from
>> "exact". It is interpreted as an optional argument, which is not
>> recognized and therefore ignored in this context.
>>
>> Hope this helps.
>> Spencer
>>
>> > On Thu, Mar 18, 2021 at 10:47 PM Vivek Das <vd4mm...@gmail.com> wrote:
>> >
>> >> Hi Bogdan,
>> >>
>> >> You can also get the information from the link of the wilcox.test
>> >> function page.
>> >>
>> >> “By default (if exact is not specified), an exact p-value is computed if
>> >> the samples contain less than 50 finite values and there are no ties.
>> >> Otherwise, a normal approximation is used.”
>> >>
>> >> For more:
>> >>
>> >> https://stat.ethz.ch/R-manual/R-devel/library/stats/html/wilcox.test.html
>> >>
>> >> Hope this helps!
>> >>
>> >> Best,
>> >>
>> >> VD
>> >>
>> >> On Thu, Mar 18, 2021 at 10:36 PM Bogdan Tanasa <tan...@gmail.com> wrote:
>> >>
>> >>> Dear Peter, thanks a lot. yes, we can see a very precise p-value, and
>> >>> that was the request from the journal.
>> >>>
>> >>> if I may ask another question please : what is the meaning of
>> >>> "exact=TRUE" or "exact=FALSE" in wilcox.test ?
>> >>>
>> >>> i can see that the "numerically precise" p-values are different.
>> >>> thanks a lot !
>> >>>
>> >>> tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>> >>> tst$p.value
>> >>> [1] 8.535524e-25
>> >>>
>> >>> tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=FALSE)
>> >>> tst$p.value
>> >>> [1] 3.448211e-25
>> >>>
>> >>> On Thu, Mar 18, 2021 at 10:15 PM Peter Langfelder <
>> >>> peter.langfel...@gmail.com> wrote:
>> >>>
>> >>>> I think the answer is much simpler. The print method for hypothesis
>> >>>> tests (class htest) truncates the p-values. In the above example,
>> >>>> instead of using
>> >>>>
>> >>>> wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>> >>>>
>> >>>> and copying the output, just print the p-value:
>> >>>>
>> >>>> tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>> >>>> tst$p.value
>> >>>>
>> >>>> [1] 2.988368e-32
>> >>>>
>> >>>> I think this value is what the journal asks for.
>> >>>>
>> >>>> HTH,
>> >>>>
>> >>>> Peter
>> >>>>
>> >>>> On Thu, Mar 18, 2021 at 10:05 PM Spencer Graves
>> >>>> <spencer.gra...@effectivedefense.org> wrote:
>> >>>>> I would push back on that from two perspectives:
>> >>>>>
>> >>>>> 1. I would study exactly what the journal said very carefully. If
>> >>>>> they mandated "wilcox.test", that function has an argument called
>> >>>>> "exact".
>> >>>>> If that's what they are asking, then using that argument gives the
>> >>>>> exact p-value, e.g.:
>> >>>>>
>> >>>>> > wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>> >>>>>
>> >>>>>         Wilcoxon rank sum exact test
>> >>>>>
>> >>>>> data:  rnorm(100) and rnorm(100, 2)
>> >>>>> W = 691, p-value < 2.2e-16
>> >>>>>
>> >>>>> 2. If that's NOT what they are asking, then I'm not convinced what
>> >>>>> they are asking makes sense: There is no such thing as an "exact
>> >>>>> p value" except to the extent that certain assumptions hold, and all
>> >>>>> models are wrong (but some are useful), as George Box famously said
>> >>>>> years ago.[1] Truth only exists in mathematics, and that's because
>> >>>>> it's a fiction to start with ;-)
>> >>>>>
>> >>>>> Hope this helps.
>> >>>>> Spencer Graves
>> >>>>>
>> >>>>> [1]
>> >>>>> https://en.wikipedia.org/wiki/All_models_are_wrong
>> >>>>>
>> >>>>> On 2021-3-18 11:12 PM, Bogdan Tanasa wrote:
>> >>>>>> <https://meta.stackexchange.com/questions/362285/about-a-p-value-2-2e-16>
>> >>>>>> Dear all,
>> >>>>>>
>> >>>>>> i would appreciate having your advice on the following please :
>> >>>>>>
>> >>>>>> in R, the wilcox.test() provides "a p-value < 2.2e-16", when we
>> >>>>>> compare sets of 1000 genes expression (in the genomics field).
>> >>>>>>
>> >>>>>> however, the journal asks us to provide the exact p value ...
>> >>>>>>
>> >>>>>> would it be legitimate to write : "p-value = 0" ? thanks a lot,
>> >>>>>>
>> >>>>>> -- bogdan
>> >>>>>>
>> >>>>>> ______________________________________________
>> >>>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>>>>> PLEASE do read the posting guide
>> >>>>>> http://www.R-project.org/posting-guide.html
>> >>>>>> and provide commented, minimal, self-contained, reproducible code.
>> >> --
>> >> Vivek Das, PhD
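
To make the two points at the top of this message concrete (the exact p-value coming from `pwilcox`, and the fallback to the normal approximation when ties are present), here is a minimal sketch. The tail formula below is my reading of the two-sided exact branch and should be treated as an assumption; the data and the names `p_manual`, `x_t`, `y_t` and `warn_msg` are made up for illustration.

```r
# (1) No ties: reproduce the exact two-sided p-value directly with pwilcox().
set.seed(1)
x <- rnorm(10)
y <- rnorm(10, mean = 2)
m <- length(x); n <- length(y)

res <- wilcox.test(x, y, exact = TRUE)
W   <- unname(res$statistic)

# Tail probability on the side where W falls, doubled and capped at 1
# (assumed to mirror the two-sided exact computation).
tail_p   <- if (W > m * n / 2) pwilcox(W - 1, m, n, lower.tail = FALSE) else pwilcox(W, m, n)
p_manual <- min(2 * tail_p, 1)
same_as_pwilcox <- isTRUE(all.equal(p_manual, res$p.value))

# (2) Ties: exact = TRUE falls back to the normal approximation.
x_t <- c(1, 2, 2, 3, 4)
y_t <- c(2, 5, 6, 7, 8)

warn_msg <- NULL
p_ties_exact <- withCallingHandlers(
  wilcox.test(x_t, y_t, exact = TRUE)$p.value,
  warning = function(w) {
    warn_msg <<- conditionMessage(w)   # keep the message, silence the warning
    invokeRestart("muffleWarning")
  }
)
p_ties_normal <- wilcox.test(x_t, y_t, exact = FALSE)$p.value
same_with_ties <- isTRUE(all.equal(p_ties_exact, p_ties_normal))

print(c(same_as_pwilcox = same_as_pwilcox, same_with_ties = same_with_ties))
print(warn_msg)
```

If both flags come back `TRUE`, the exact p-value is a direct lookup in the distribution of the statistic, and with ties the value of `exact` makes no difference to the result.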