On 2021-3-19 9:52 AM, Jiefei Wang wrote:
After digging into the R source, it turns out that the argument `exact` has
nothing to do with numeric precision. It only affects the distribution
used to compute the p-value. When `exact=TRUE`, the true distribution
of the statistic is used. Otherwise, a normal approximation is
used.

I think the documentation needs to be improved here: you can compute the
exact p-value *only* when you do not have any ties in your data. If you
have ties in your data, you will get the p-value from the normal
approximation no matter what value you pass to `exact`. This behavior should
be documented, or a warning should be given when `exact=TRUE` and ties
are present.

FYI, if the exact p-value is required, the `pwilcox` function is used to
compute it. There are no details on how it computes the p-value, but
its C code seems to build the probability table, so I assume it computes
the exact p-value from the true distribution of the statistic, not a
permutation or MC p-value.
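
A minimal sketch of that check, assuming small samples without ties (the seed and sample sizes are illustrative choices, not taken from the thread). It rebuilds the two-sided exact p-value from `pwilcox` and compares it with what `wilcox.test(exact=TRUE)` returns. The doubling rule below is an assumption based on a reading of the two-sided case in the R source, not a documented contract.

```r
# Small continuous samples, so ties are not expected
set.seed(1)
x <- rnorm(10)
y <- rnorm(10, 2)
nx <- length(x)
ny <- length(y)

res <- wilcox.test(x, y, exact = TRUE)
W <- unname(res$statistic)

# One-sided tail probability from the exact distribution of W,
# taken on whichever side of the centre nx * ny / 2 the statistic falls
p_tail <- if (W > nx * ny / 2) {
  pwilcox(W - 1, nx, ny, lower.tail = FALSE)
} else {
  pwilcox(W, nx, ny)
}

# Two-sided p-value: double the tail, capped at 1
p_manual <- min(2 * p_tail, 1)

print(all.equal(p_manual, res$p.value))
```

If the two values agree, that supports the reading that the exact p-value comes from the tabulated distribution of the statistic rather than from simulation.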


      My example shows that it does NOT use Monte Carlo: repeated calls
return identical p-values, so it must use some fixed distribution.  I
believe the term "exact" means that it uses the permutation distribution,
though I could be mistaken.  If it's NOT a permutation distribution, I
don't know what it is.


      Spencer

Best,
Jiefei



On Fri, Mar 19, 2021 at 10:01 PM Jiefei Wang <szwj...@gmail.com> wrote:

Hey,

I just want to point out that the word "exact" has two meanings. It can
mean the numerically accurate p-value, as Bogdan asked in his first email,
or it can mean the p-value calculated from the exact distribution of the
statistic (in this case, the U statistic). These two are actually not
related, even though both are called "exact".

Best,
Jiefei

On Fri, Mar 19, 2021 at 9:31 PM Spencer Graves <
spencer.gra...@effectivedefense.org> wrote:


On 2021-3-19 12:54 AM, Bogdan Tanasa wrote:
Thanks a lot, Vivek! In other words, assuming that we work with 1000
data points:

if we use EXACT=TRUE, it uses the normal approximation,

while if EXACT=FALSE (for these large samples), it does not?

        As David Winsemius noted, the documentation is not clear.
Consider the following:

> set.seed(1)
> x <- rnorm(100)
> y <- rnorm(100, 2)
>
> wilcox.test(x, y)$p.value
[1] 1.172189e-25
> wilcox.test(x, y)$p.value
[1] 1.172189e-25
>
> wilcox.test(x, y, EXACT=TRUE)$p.value
[1] 1.172189e-25
> wilcox.test(x, y, EXACT=TRUE)$p.value
[1] 1.172189e-25
> wilcox.test(x, y, exact=TRUE)$p.value
[1] 4.123875e-32
> wilcox.test(x, y, exact=TRUE)$p.value
[1] 4.123875e-32
>
> wilcox.test(x, y, EXACT=FALSE)$p.value
[1] 1.172189e-25
> wilcox.test(x, y, EXACT=FALSE)$p.value
[1] 1.172189e-25
> wilcox.test(x, y, exact=FALSE)$p.value
[1] 1.172189e-25
> wilcox.test(x, y, exact=FALSE)$p.value
[1] 1.172189e-25
>

        We get two values here: 1.172189e-25 and 4.123875e-32. The first
one, I think, is the normal approximation, which is the same as
exact=FALSE. I think that with exact=TRUE, you get a permutation
distribution, though I'm not sure. You might try looking at "wilcox_test
in package coin for exact, asymptotic and Monte Carlo conditional
p-values, including in the presence of ties" to see if it is clearer.

        NOTE: R is case sensitive, so "EXACT" is a different variable from
"exact". It is interpreted as an optional argument, which is not
recognized and therefore ignored in this context.
           Hope this helps.
           Spencer
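
The case-sensitivity point can be checked directly. This is a minimal sketch, assuming the same seed and data as the transcript above. The variable names `p_default`, `p_upper`, and `p_lower` are introduced here for illustration only.

```r
set.seed(1)
x <- rnorm(100)
y <- rnorm(100, 2)

# No 'exact' argument at all: the function's default rule applies
p_default <- wilcox.test(x, y)$p.value

# 'EXACT' does not match the formal argument 'exact', so it is
# passed through '...' and has no effect on the computation
p_upper <- wilcox.test(x, y, EXACT = TRUE)$p.value

# 'exact' is the real argument and does change the computation
p_lower <- wilcox.test(x, y, exact = TRUE)$p.value

print(identical(p_default, p_upper))
print(c(default = p_default, EXACT = p_upper, exact = p_lower))
```

The first comparison is the informative one: a misspelled argument name silently leaves the result unchanged, which is why the EXACT=TRUE and EXACT=FALSE calls in the transcript give the same number.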


On Thu, Mar 18, 2021 at 10:47 PM Vivek Das <vd4mm...@gmail.com> wrote:

Hi Bogdan,

You can also get the information from the link to the wilcox.test
function page.

“By default (if exact is not specified), an exact p-value is computed
if the samples contain less than 50 finite values and there are no ties.
Otherwise, a normal approximation is used.”
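
A minimal sketch of that default rule, assuming two small samples (10 values each, an illustrative choice) with no ties: leaving `exact` unspecified should then give the same p-value as `exact=TRUE`. The last comparison, for samples of 100, reflects the behaviour reported in this thread and may depend on the R version.

```r
set.seed(1)

# Small samples: fewer than 50 finite values per group, no ties expected
xs <- rnorm(10)
ys <- rnorm(10, 2)
p_small_default <- wilcox.test(xs, ys)$p.value
p_small_exact   <- wilcox.test(xs, ys, exact = TRUE)$p.value
print(identical(p_small_default, p_small_exact))

# Larger samples: per the documentation quoted above, the default
# should fall back to the normal approximation (version-dependent)
xl <- rnorm(100)
yl <- rnorm(100, 2)
p_large_default <- wilcox.test(xl, yl)$p.value
p_large_approx  <- wilcox.test(xl, yl, exact = FALSE)$p.value
print(c(default = p_large_default, approx = p_large_approx))
```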

For more:


https://stat.ethz.ch/R-manual/R-devel/library/stats/html/wilcox.test.html
Hope this helps!

Best,

VD


On Thu, Mar 18, 2021 at 10:36 PM Bogdan Tanasa <tan...@gmail.com>
wrote:
Dear Peter, thanks a lot. Yes, we can see a very precise p-value, and
that was the request from the journal.

If I may ask another question please: what is the meaning of
"exact=TRUE" or "exact=FALSE" in wilcox.test?

I can see that the "numerically precise" p-values are different.
Thanks a lot!

tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
tst$p.value
[1] 8.535524e-25

tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=FALSE)
tst$p.value
[1] 3.448211e-25

On Thu, Mar 18, 2021 at 10:15 PM Peter Langfelder <
peter.langfel...@gmail.com> wrote:

I think the answer is much simpler. The print method for hypothesis
tests (class htest) truncates the p-values. In the above example,
instead of using

wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)

and copying the output, just print the p-value:

tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
tst$p.value

[1] 2.988368e-32


I think this value is what the journal asks for.
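
A minimal sketch of the difference between the printed and the stored p-value, assuming a fixed seed (not used in the original message) so that the run is repeatable. `format.pval` is the base R helper that reports values below its `eps` threshold (by default `.Machine$double.eps`) with a leading "<"; the variable names are illustrative.

```r
set.seed(1)
tst <- wilcox.test(rnorm(100), rnorm(100, 2), exact = TRUE)

# Formatted the way a printed summary would show it: values below
# machine epsilon are reported as an inequality
p_shown <- format.pval(tst$p.value)
print(p_shown)

# The stored value keeps full double precision
p_stored <- tst$p.value
print(p_stored)

# For a manuscript, format the stored value explicitly
print(format(p_stored, digits = 3, scientific = TRUE))
```

The stored value is still only as meaningful as the distribution used to compute it, which is the separate question the rest of the thread discusses.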

HTH,

Peter

On Thu, Mar 18, 2021 at 10:05 PM Spencer Graves
<spencer.gra...@effectivedefense.org> wrote:
         I would push back on that from two perspectives:


               1.  I would study exactly what the journal said very
carefully.  If they mandated "wilcox.test", that function has an
argument called "exact".  If that's what they are asking, then using
that argument gives the exact p-value, e.g.:


   > wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)

           Wilcoxon rank sum exact test

data:  rnorm(100) and rnorm(100, 2)
W = 691, p-value < 2.2e-16


               2.  If that's NOT what they are asking, then I'm not
convinced what they are asking makes sense:  There is no such thing
as an "exact p value" except to the extent that certain assumptions
hold, and all models are wrong (but some are useful), as George Box
famously said years ago.[1]  Truth only exists in mathematics, and
that's because it's a fiction to start with ;-)


         Hope this helps.
         Spencer Graves


[1]
https://en.wikipedia.org/wiki/All_models_are_wrong


On 2021-3-18 11:12 PM, Bogdan Tanasa wrote:
Dear all,

i would appreciate having your advice on the following please :

in R, the wilcox.test() provides "a p-value < 2.2e-16", when we compare
sets of 1000 genes expression (in the genomics field).

however, the journal asks us to provide the exact p value ...

would it be legitimate to write : "p-value = 0" ? thanks a lot,

-- bogdan

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
----------------------------------------------------------

Vivek Das, PhD

