Dear Jiefei and all, many thanks for your time, comments, suggestions, and insights.
-- bogdan

On Fri, Mar 19, 2021 at 7:52 AM Jiefei Wang <szwj...@gmail.com> wrote:

> After digging into the R source, it turns out that the argument `exact`
> has nothing to do with numeric precision. It only affects the statistical
> model used to compute the p-value. When `exact=TRUE`, the true distribution
> of the statistic is used. Otherwise, a normal approximation is used.
>
> I think the documentation needs to be improved here: you can compute the
> exact p-value *only* when you do not have any ties in your data. If you
> have ties in your data, you will get the p-value from the normal
> approximation no matter what value you pass to `exact`. This behavior should
> be documented, or a warning should be given when `exact=TRUE` and ties
> are present.
>
> FYI, if the exact p-value is required, the `pwilcox` function is used to
> compute it. There are no details on how it computes the p-value, but
> its C code seems to compute the probability table, so I assume it computes
> the exact p-value from the true distribution of the statistic, not a
> permutation or MC p-value.
>
> Best,
> Jiefei
>
> On Fri, Mar 19, 2021 at 10:01 PM Jiefei Wang <szwj...@gmail.com> wrote:
>
>> Hey,
>>
>> I just want to point out that the word "exact" has two meanings. It can
>> mean the numerically accurate p-value, as Bogdan asked in his first email,
>> or it can mean the p-value calculated from the exact distribution of the
>> statistic (in this case, the U statistic). These two are actually not
>> related, even though both are called "exact".
>>
>> Best,
>> Jiefei
>>
>> On Fri, Mar 19, 2021 at 9:31 PM Spencer Graves <
>> spencer.gra...@effectivedefense.org> wrote:
>>
>>> On 2021-3-19 12:54 AM, Bogdan Tanasa wrote:
>>> > thanks a lot, Vivek ! in other words, assuming that we work with 1000
>>> > data points,
>>> >
>>> > if we use EXACT = TRUE, does it use the normal approximation,
>>> >
>>> > while with EXACT=FALSE (for these large samples), it does not ?
>>>
>>> As David Winsemius noted, the documentation is not clear.
>>> Consider the following:
>>>
>>> > set.seed(1)
>>> > x <- rnorm(100)
>>> > y <- rnorm(100, 2)
>>> >
>>> > wilcox.test(x, y)$p.value
>>> [1] 1.172189e-25
>>> >
>>> > wilcox.test(x, y, EXACT=TRUE)$p.value
>>> [1] 1.172189e-25
>>> > wilcox.test(x, y, exact=TRUE)$p.value
>>> [1] 4.123875e-32
>>> >
>>> > wilcox.test(x, y, EXACT=FALSE)$p.value
>>> [1] 1.172189e-25
>>> > wilcox.test(x, y, exact=FALSE)$p.value
>>> [1] 1.172189e-25
>>>
>>> We get two values here: 1.172189e-25 and 4.123875e-32. The first one,
>>> I think, is the normal approximation, which is the same as exact=FALSE.
>>> I think that with exact=TRUE, you get a permutation distribution,
>>> though I'm not sure. You might try looking at "wilcox_test in package
>>> coin for exact, asymptotic and Monte Carlo conditional p-values,
>>> including in the presence of ties" to see if it is clearer.
>>>
>>> NOTE: R is case sensitive, so "EXACT" is a different variable from
>>> "exact". It is interpreted as an optional argument, which is not
>>> recognized and therefore ignored in this context.
>>>
>>> Hope this helps.
>>> Spencer
>>>
>>> > On Thu, Mar 18, 2021 at 10:47 PM Vivek Das <vd4mm...@gmail.com> wrote:
>>> >
>>> >> Hi Bogdan,
>>> >>
>>> >> You can also get the information from the link to the wilcox.test
>>> >> function page.
>>> >>
>>> >> “By default (if exact is not specified), an exact p-value is computed if
>>> >> the samples contain less than 50 finite values and there are no ties.
>>> >> Otherwise, a normal approximation is used.”
>>> >>
>>> >> For more:
>>> >>
>>> >> https://stat.ethz.ch/R-manual/R-devel/library/stats/html/wilcox.test.html
>>> >>
>>> >> Hope this helps!
>>> >>
>>> >> Best,
>>> >>
>>> >> VD
>>> >>
>>> >> On Thu, Mar 18, 2021 at 10:36 PM Bogdan Tanasa <tan...@gmail.com> wrote:
>>> >>
>>> >>> Dear Peter, thanks a lot. yes, we can see a very precise p-value, and
>>> >>> that was the request from the journal.
>>> >>>
>>> >>> if I may ask another question please : what is the meaning of
>>> >>> "exact=TRUE" or "exact=FALSE" in wilcox.test ?
>>> >>>
>>> >>> i can see that the "numerically precise" p-values are different.
>>> >>> thanks a lot !
>>> >>>
>>> >>> tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>>> >>> tst$p.value
>>> >>> [1] 8.535524e-25
>>> >>>
>>> >>> tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=FALSE)
>>> >>> tst$p.value
>>> >>> [1] 3.448211e-25
>>> >>>
>>> >>> On Thu, Mar 18, 2021 at 10:15 PM Peter Langfelder <
>>> >>> peter.langfel...@gmail.com> wrote:
>>> >>>
>>> >>>> I think the answer is much simpler. The print method for hypothesis
>>> >>>> tests (class htest) truncates the p-values. In the above example,
>>> >>>> instead of using
>>> >>>>
>>> >>>> wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>>> >>>>
>>> >>>> and copying the output, just print the p-value:
>>> >>>>
>>> >>>> tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>>> >>>> tst$p.value
>>> >>>>
>>> >>>> [1] 2.988368e-32
>>> >>>>
>>> >>>> I think this value is what the journal asks for.
>>> >>>>
>>> >>>> HTH,
>>> >>>>
>>> >>>> Peter
>>> >>>>
>>> >>>> On Thu, Mar 18, 2021 at 10:05 PM Spencer Graves
>>> >>>> <spencer.gra...@effectivedefense.org> wrote:
>>> >>>>> I would push back on that from two perspectives:
>>> >>>>>
>>> >>>>> 1. I would study exactly what the journal said very
>>> >>>>> carefully. If they mandated "wilcox.test", that function has an
>>> >>>>> argument called "exact".
>>> >>>>> If that's what they are asking, then using
>>> >>>>> that argument gives the exact p-value, e.g.:
>>> >>>>>
>>> >>>>> > wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>>> >>>>>
>>> >>>>>         Wilcoxon rank sum exact test
>>> >>>>>
>>> >>>>> data:  rnorm(100) and rnorm(100, 2)
>>> >>>>> W = 691, p-value < 2.2e-16
>>> >>>>>
>>> >>>>> 2. If that's NOT what they are asking, then I'm not
>>> >>>>> convinced what they are asking makes sense: There is no such thing
>>> >>>>> as an "exact p value" except to the extent that certain assumptions
>>> >>>>> hold, and all models are wrong (but some are useful), as George Box
>>> >>>>> famously said years ago.[1] Truth only exists in mathematics, and
>>> >>>>> that's because it's a fiction to start with ;-)
>>> >>>>>
>>> >>>>> Hope this helps.
>>> >>>>> Spencer Graves
>>> >>>>>
>>> >>>>> [1]
>>> >>>>> https://en.wikipedia.org/wiki/All_models_are_wrong
>>> >>>>>
>>> >>>>> On 2021-3-18 11:12 PM, Bogdan Tanasa wrote:
>>> >>>>>> <https://meta.stackexchange.com/questions/362285/about-a-p-value-2-2e-16>
>>> >>>>>> Dear all,
>>> >>>>>>
>>> >>>>>> i would appreciate having your advice on the following please :
>>> >>>>>>
>>> >>>>>> in R, the wilcox.test() provides "a p-value < 2.2e-16", when we
>>> >>>>>> compare sets of 1000 genes expression (in the genomics field).
>>> >>>>>>
>>> >>>>>> however, the journal asks us to provide the exact p value ...
>>> >>>>>>
>>> >>>>>> would it be legitimate to write : "p-value = 0" ?
>>> >>>>>> thanks a lot,
>>> >>>>>>
>>> >>>>>> -- bogdan
>>> >>>>>>
>>> >>>>>> [[alternative HTML version deleted]]
>>> >>>>>>
>>> >>>>>> ______________________________________________
>>> >>>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> >>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> >>>>>> PLEASE do read the posting guide
>>> >>>>>> http://www.R-project.org/posting-guide.html
>>> >>>>>> and provide commented, minimal, self-contained, reproducible code.
>>> >>
>>> >> --
>>> >> ----------------------------------------------------------
>>> >>
>>> >> Vivek Das, PhD
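The distinction drawn in the thread between the two meanings of "exact" can be illustrated with a short R sketch. It is not taken from any of the messages: the sample size of 20, the seed, and all variable names are assumptions chosen so that the data are small and free of ties. It compares the p-value from the exact distribution of the statistic with the one from the normal approximation, repeats the comparison on rounded (tied) data, and prints the machine epsilon, which is the default threshold below which a printed p-value is shown as "< 2.2e-16".

```r
# Sketch only: sample size, seed and variable names are illustrative assumptions.
set.seed(1)
x <- rnorm(20)             # continuous values, so no ties are expected
y <- rnorm(20, mean = 2)

# exact = TRUE: p-value from the exact null distribution of the statistic
p_exact <- wilcox.test(x, y, exact = TRUE)$p.value

# exact = FALSE: p-value from the normal approximation
p_normal <- wilcox.test(x, y, exact = FALSE)$p.value

# Rounding introduces ties. As described in the thread, the normal
# approximation is then expected to be used whatever `exact` is set to;
# suppressWarnings() hides any warning R may raise for exact = TRUE.
xt <- round(x)
yt <- round(y)
p_ties_exact  <- suppressWarnings(wilcox.test(xt, yt, exact = TRUE)$p.value)
p_ties_normal <- wilcox.test(xt, yt, exact = FALSE)$p.value

# The "< 2.2e-16" in the printed test summary is a display threshold
# (the machine epsilon); the stored p.value is not truncated.
print(.Machine$double.eps)
print(format(c(exact = p_exact, normal = p_normal), digits = 4))
print(c(ties_exact = p_ties_exact, ties_normal = p_ties_normal))
```

Comparing the two values printed for the tied data is a quick way to check, on a given R installation, whether the fallback described by Jiefei applies there.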