On Aug 18, 2010, at 11:55 AM, Cedric Laczny wrote:

> I was able to trace the unexpected behavior down to the following line:
> SIGMA <- sqrt((n.x * n.y/12) * ((n.x + n.y + 1) -
>                sum(NTIES^3 - NTIES)/((n.x + n.y) * (n.x + n.y - 1))))
> My calculations of the Z-score for the normal approximation were based on
> using the standard deviation for ranks _without_ ties. The above formula
> seems to account for ties and thus yields a slightly different z-score.
> However, the data seems to include at most 1 tie (based on rnorm), so it
> would be the same result as if it contained no tie (1^3 - 1 has the same
> result as 0^3 - 0, obviously ;) ) and thus I would expect the result to be
> the same as when using the formula for the standard deviation without ties.

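Note the definition NTIES <- table(r): it counts the number of observations 
sharing each distinct rank, so it is all ones if and only if there are NO ties 
in the data. A single pair of tied values shows up as an entry of 2, which 
contributes 2^3 - 2 = 6 to the sum, not 1^3 - 1 = 0.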
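
A minimal sketch of what table(r) looks like with one tied pair, and how
that feeds into the SIGMA line quoted above. The vectors x and y are made-up
illustration data, not the data from this thread, and tie_term, SIGMA_ties and
SIGMA_noties are names introduced here only for the comparison:

```r
# Hypothetical data: the value 2.3 occurs once in each sample (one tied pair).
x <- c(1.1, 2.3, 3.7, 4.2)
y <- c(0.5, 2.3, 5.0)

n.x <- length(x)
n.y <- length(y)

r <- rank(c(x, y))   # midranks of the pooled sample; the tied pair gets 3.5
NTIES <- table(r)    # number of observations at each distinct rank value

# Untied ranks contribute 1^3 - 1 = 0; the tied pair contributes 2^3 - 2 = 6.
tie_term <- sum(NTIES^3 - NTIES)

# Standard deviation with the tie correction (same expression as quoted above).
SIGMA_ties <- sqrt((n.x * n.y / 12) *
                   ((n.x + n.y + 1) -
                    tie_term / ((n.x + n.y) * (n.x + n.y - 1))))

# Standard deviation without any tie correction.
SIGMA_noties <- sqrt(n.x * n.y * (n.x + n.y + 1) / 12)

print(NTIES)
print(c(with_ties = SIGMA_ties, without_ties = SIGMA_noties))
```

With no ties every entry of NTIES is 1, tie_term is 0, and the two
standard deviations coincide; any tie makes the corrected one smaller.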

(If you are in paper-and-pencil mode, these formulas are fairly easily worked 
out once you realize that you only need the mean and variance of the rank of a 
single observation -- the covariances are C(R1,R2) = -1/(N-1) V(R1) because of 
symmetry and the fact that the sum of all N ranks is fixed.)
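
The paper-and-pencil argument can be checked numerically for the no-ties
case. The sketch below assumes a small pooled sample (N = 7, n.x = 4, chosen
arbitrarily) and compares the variance of the rank sum built from V(R1) and
C(R1,R2) with a brute-force enumeration; the variable names are mine, not
taken from the wilcox.test source:

```r
N   <- 7         # total number of observations (arbitrary small example)
n.x <- 4         # size of the first sample
n.y <- N - n.x
ranks <- 1:N     # no ties, so the ranks are simply 1..N

# Variance of the rank of a single observation (uniform on 1..N).
V <- mean(ranks^2) - mean(ranks)^2     # equals (N^2 - 1)/12

# Covariance of two distinct ranks: the sum of all N ranks is fixed, so
# 0 = Var(sum) = N*V + N*(N-1)*C, which gives C = -V/(N-1).
C <- -V / (N - 1)

# Variance of the sum of n.x ranks from the single-rank moments.
var_W <- n.x * V + n.x * (n.x - 1) * C

# Brute-force check: every subset of n.x ranks is equally likely under H0.
W <- combn(ranks, n.x, FUN = sum)
var_exact <- mean(W^2) - mean(W)^2

print(c(from_moments = var_W,
        enumerated   = var_exact,
        closed_form  = n.x * n.y * (N + 1) / 12))
```

All three numbers should agree, which is the square of the no-ties standard
deviation n.x * n.y * (n.x + n.y + 1) / 12.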

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com
