On Aug 18, 2010, at 11:55 AM, Cedric Laczny wrote:

> I was able to trace down the unexpected behavior to the following line
> SIGMA <- sqrt((n.x * n.y/12) * ((n.x + n.y + 1) -
>     sum(NTIES^3 - NTIES)/((n.x + n.y) * (n.x + n.y - 1))))
> My calculations of the Z-score for the normal approximation were based on
> using the standard deviation for ranks _without_ ties. The above formula
> seems to account for ties and thus yields a slightly different z-score.
> However, the data seems to include at most 1 tie (based on rnorm), so it
> would be the same result as if it contained no tie (1^3 - 1 has the same
> result as 0^3 - 0, obviously ;) ) and thus I would expect the result to be
> the same as when using the formula for the standard deviation without ties.
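For reference, the quoted expression can be evaluated on its own and compared with the no-ties standard deviation. This is only a minimal sketch, not the body of wilcox.test: the helper names sigma_ties and sigma_no_ties and the sample sizes are made up for illustration, and r is assumed to be the joint ranking of both samples.

```r
# Hypothetical helper: evaluates the quoted SIGMA expression for two samples.
sigma_ties <- function(x, y) {
  n.x <- length(x)
  n.y <- length(y)
  r <- rank(c(x, y))   # joint ranks (midranks where values are tied)
  NTIES <- table(r)    # number of observations at each distinct rank value
  sqrt((n.x * n.y / 12) *
       ((n.x + n.y + 1) -
        sum(NTIES^3 - NTIES) / ((n.x + n.y) * (n.x + n.y - 1))))
}

# Hypothetical helper: standard deviation of the rank sum without ties.
sigma_no_ties <- function(n.x, n.y) sqrt(n.x * n.y * (n.x + n.y + 1) / 12)

set.seed(1)
x <- rnorm(10)
y <- rnorm(12)

# Continuous data: every rank value occurs once, so the two formulas agree.
no_tie_match <- isTRUE(all.equal(sigma_ties(x, y), sigma_no_ties(10, 12)))

# Data with a tied value (2 occurs three times): the correction shrinks SIGMA.
tied_smaller <- sigma_ties(c(1, 2, 2, 3), c(2, 4, 5)) < sigma_no_ties(4, 3)
```

With tie-free input the correction term is zero and both helpers return the same value. The second comparison shows the tie-corrected value dropping below the no-ties one once a value repeats.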
Note the definition NTIES <- table(r): it counts the number of observations tied at each rank value, so it is all ones if and only if there are NO ties in the data. A single pair of tied observations shows up as a 2, not a 1. When NTIES is all ones, every term of sum(NTIES^3 - NTIES) is zero and the tie correction vanishes.

(If you are in paper-and-pencil mode, these formulas are fairly easily worked out once you realize that you only need the mean and variance of the rank of a single observation -- the covariances are C(R1,R2) = -1/(N-1) V(R1) because of symmetry and the fact that the sum of all N ranks is fixed.)

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com
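The paper-and-pencil argument can also be checked numerically. The following is a rough sketch for the no-ties case only; the sizes N = 7 and n.x = 4 are arbitrary illustrative choices, and the variable names are not taken from wilcox.test.

```r
# Illustrative sizes (assumptions): N observations in total, n.x in sample x.
N <- 7
n.x <- 4
n.y <- N - n.x

# Without ties, the rank of a single observation is uniform on 1..N.
ranks <- seq_len(N)
m <- mean(ranks)             # (N + 1) / 2
V <- mean((ranks - m)^2)     # (N^2 - 1) / 12

# Joint distribution of two ranks: all ordered pairs i != j are equally likely.
pairs <- expand.grid(i = ranks, j = ranks)
pairs <- pairs[pairs$i != pairs$j, ]
C <- mean((pairs$i - m) * (pairs$j - m))

# Symmetry plus the fixed sum of all ranks gives C = -V / (N - 1).
cov_ok <- isTRUE(all.equal(C, -V / (N - 1)))

# Variance of the sum of n.x ranks: n.x variances and n.x * (n.x - 1) covariances.
var_W <- n.x * V + n.x * (n.x - 1) * C
var_ok <- isTRUE(all.equal(var_W, n.x * n.y * (N + 1) / 12))
```

Both checks should come out TRUE. The second reproduces the familiar n.x * n.y * (N + 1) / 12 variance that the SIGMA expression reduces to when NTIES is all ones.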