I have been examining the Mann-Whitney test closely, and there are two features of the R implementation that puzzle me. First, the test statistic is reported as "W", and its value depends on the order of the arguments to the function.
> x <- c(1,3,5,7,9)
> y <- x-1
> x
[1] 1 3 5 7 9
> y
[1] 0 2 4 6 8
> wilcox.test(x, y)$statistic
 W
15
> wilcox.test(y,x)$statistic
 W
10

Delving into the implementation of the test, 15 and 10 are the "U" statistics calculated for x and y respectively. Everything I have read about Mann-Whitney chooses one or the other to report (the larger or the smaller). Why is only one being reported?

Second, I have two implementations of the algorithm. One, at http://en.wikipedia.org/wiki/Mann-Whitney_U_test, calculates U and then forms an approximately normal variable as (U-Mu)/S. The other, from http://faculty.vassar.edu/lowry/webtext.html, does not use U and instead uses the ranks directly. The second algorithm agrees with the output of wilcox.test in R. Why calculate U, or W, at all?

cheers
Worik
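
P.S. For anyone who wants to check the two values by hand, here is a minimal sketch using the same x and y as in the session above. It assumes the reported W is the sum of the joint ranks of the first argument minus n(n+1)/2, where n is the length of the first argument. That is my reading of the statistic, not a quotation of the R source. The names u_stat, u_x, u_y and u_pairs are made up for illustration and are not part of R.

```r
# Same data as in the session above
x <- c(1, 3, 5, 7, 9)
y <- x - 1

# Hypothetical helper (not part of R): rank-sum form of the statistic
# for the first sample 'a' against the second sample 'b'.
u_stat <- function(a, b) {
  r <- rank(c(a, b))                 # joint ranks; ties would get midranks
  n_a <- length(a)
  sum(r[seq_along(a)]) - n_a * (n_a + 1) / 2
}

u_x <- u_stat(x, y)   # 15, matches wilcox.test(x, y)$statistic
u_y <- u_stat(y, x)   # 10, matches wilcox.test(y, x)$statistic

# Equivalent pair-counting form: number of (x_i, y_j) pairs with x_i > y_j,
# counting ties as one half.
u_pairs <- sum(outer(x, y, ">")) + 0.5 * sum(outer(x, y, "=="))

# The two statistics are complementary: they add up to length(x) * length(y).
print(c(u_x = u_x, u_y = u_y, total = u_x + u_y, pairs = u_pairs))
```

Under that assumption the two numbers always sum to length(x) * length(y), here 25, so either one determines the other, and swapping the arguments simply switches which of the pair is printed.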