So I tried adding Infinity support for all cases.
And it is (as could be expected) more complicated than I thought.

It is easy to add Inf support for the test. The problems start with 
conf.int=TRUE.

Currently confidence intervals are computed via `uniroot()` and, in the case of infinities, we are computationally looking for roots over infinite interval which results in an error. I suspect this is the reason Inf values were removed in the first place.

Just a note, I found a few more errors/inconsistencies when requesting confidence intervals with paired=TRUE (due to Infinities being left in).

Current error in Inf-Inf scenario:

wilcox.test(c(1,2,Inf), c(4,8,Inf), paired=TRUE, conf.int=TRUE)
  Error in if (ZEROES) x <- x[x != 0] :
    missing value where TRUE/FALSE needed



NaN confidence intervals:

wilcox.test(c(1:9,Inf), c(21:28,Inf,30), paired=TRUE, conf.int=TRUE)

  Wilcoxon signed rank test with continuity correction

  data:  c(1:9, Inf) and c(21:28, Inf, 30)
  V = 9.5, p-value = 0.0586
  alternative hypothesis: true location shift is not equal to 0
  0 percent confidence interval:
   NaN NaN
   sample estimates:
   midrange
        NaN


The easiest "fix" for consistency would be to simply remove Infinity support from the paired=TRUE case.

But going with the more desirable approach of adding Infinity support for non-paired cases - it is currently not clear to me what confidence intervals and pseudomedian should be. Specially when Infinities are on both sides.

Regards,
Karolis Koncevičius.


On 2019-12-07 23:18, Karolis Koncevičius wrote:
Thank you for a fast response. Nice to see this mailing list being so alive.

Regarding Inf issue: I agree with your assessment that Inf should not be removed. The code gave me an impression that Inf values were intentionally removed (since is.finite() was used everywhere, except for paired case). I will try to adjust my patch according to your feedback.

One more thing: it seems like you assumed that issues 2:4 are all related to machine precision, which is not the case - only 2nd issue is. Just wanted to draw this to your attention, in case you might have some feedback and guidelines about issues 3 and 4 as well.



On 2019-12-07 21:59, Martin Maechler wrote:
Karolis Koncevičius
   on Sat, 7 Dec 2019 20:55:36 +0200 writes:

  > Hello,
  > Writing to share some things I've found about wilcox.test() that seem a
  > a bit inconsistent.

  > 1. Inf values are not removed if paired=TRUE

  > # returns different results (Inf is removed):
  > wilcox.test(c(1,2,3,4), c(0,9,8,7))
  > wilcox.test(c(1,2,3,4), c(0,9,8,Inf))

  > # returns the same result (Inf is left as value with highest rank):
  > wilcox.test(c(1,2,3,4), c(0,9,8,7), paired=TRUE)
  > wilcox.test(c(1,2,3,4), c(0,9,8,Inf), paired=TRUE)

Now which of the two cases do you consider correct ?

IHMO, the 2nd one is correct: it is exactly one property of the
*robustness* of the wilcoxon test and very desirable that any
(positive) outlier is treated the same as just "the largest
value" and the test statistic (and hence the p-value) should not
change.

So I think the first case is wrong, notably if modified, (such
that the last  y  is the overall maximal value (slightly larger sample):

wilcox.test(1:7, 1/8+ c(9:4, 12))

        Wilcoxon rank sum test

data:  1:7 and 1/8 + c(9:4, 12)
W = 6, p-value = 0.01748
alternative hypothesis: true location shift is not equal to 0

wilcox.test(1:7, 1/8+ c(9:4, 10000))

        Wilcoxon rank sum test

data:  1:7 and 1/8 + c(9:4, 10000)
W = 6, p-value = 0.01748
alternative hypothesis: true location shift is not equal to 0

wilcox.test(1:7, 1/8+ c(9:4, Inf))

        Wilcoxon rank sum test

data:  1:7 and 1/8 + c(9:4, Inf)
W = 6, p-value = 0.03497
alternative hypothesis: true location shift is not equal to 0

The  Inf  case should definitely give the same as the  10'000 case.
That's exactly one property of a robust statistic.

Thank you, Karolis, this is pretty embarrassing to only be detected now after 
25+ years of R in use ...

The correct fix starts with replacing the  is.finite()  by   !is.na() and keep 
the 'Inf' in the rank computations...
(but then probably also deal with the case of more than one Inf, notably the  Inf - Inf  
"exception" which is not triggered by your example...)


---

Ben addressed the "rounding" / numerical issues unavoidable for
the other problems.

  > 2. tolerance issues with paired=TRUE.

  > wilcox.test(c(4, 3, 2), c(3, 2, 1), paired=TRUE)
  > # ...
  > # Warning:  cannot compute exact p-value with ties

  > wilcox.test(c(0.4,0.3,0.2), c(0.3,0.2,0.1), paired=TRUE)
  > # ...
  > # no warning

  > 3. Always 'x observations are missing' when paired=TRUE

  > wilcox.test(c(1,2), c(NA_integer_,NA_integer_), paired=TRUE)
  > # ...
  > # Error:  not enough (finite) 'x' observations

  > 4. No indication if normal approximation was used:

  > # different numbers, but same "method" name
  > wilcox.test(rnorm(10), exact=FALSE, correct=FALSE)
  > wilcox.test(rnorm(10), exact=TRUE, correct=FALSE)


  > From all of these I am pretty sure the 1st one is likely unintended,
  > so attaching a small patch to adjust it. Can also try patching others if
  > consensus is reached that the behavioiur has to be modified.

  > Kind regards,
  > Karolis Koncevičius.

  > ---

  > Index: wilcox.test.R
  > ===================================================================
  > --- wilcox.test.R  (revision 77540)
  > +++ wilcox.test.R  (working copy)
  > @@ -42,7 +42,7 @@
  > if(paired) {
  > if(length(x) != length(y))
  > stop("'x' and 'y' must have the same length")
  > -            OK <- complete.cases(x, y)
  > +            OK <- is.finite(x) & is.finite(y)
  > x <- x[OK] - y[OK]
  > y <- NULL
  > }

  > ______________________________________________
  > R-devel@r-project.org mailing list
  > https://stat.ethz.ch/mailman/listinfo/r-devel

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to