I've fixed this by adding 0.5/mn to q. The problem (at least in principle)
with multiplying them all up is integer overflow. By the time 0.5/mn
underflows to zero, missing one value in the distribution won't matter.

    -thomas

On Fri, 18 Dec 2009, David John Allwright wrote:

> Dear Thomas,
>
> Right, thank you. Yes, I haven't looked at the source code (because I
> don't know C) but something like what you mention could well cause the
> kind of problems I am seeing: a loop being executed one too few or one
> too many times. And yes, I think those quantities should be multiplied up
> by m*n to all become integers so we escape rounding error problems.
>
> David.
>
> ------------------------------------------------------------------------------
> On Wed, 16 Dec 2009, tlum...@u.washington.edu wrote:
>
>> On Tue, 15 Dec 2009, allwr...@maths.ox.ac.uk wrote (in part):
>>
>>> x<-1:5
>>> y<-c(2.5,4.5)
>>> ks.test(x,y)
>>>
>>> The value of the D_2,5 statistic is calculated as 0.4 correctly, but the
>>> p-value is stated by R as 1, though in fact it should be 20/21=0.9524
>>
>> What we seem to have here is a rounding error problem.
>>
>> In ks.c:psmirnov2x, there is a double loop including
>>     if(fabs(i / md - j / nd) > q)
>>         u[j] = 0;
>>
>> where md=2, nd=5, and q=3/10.
>>
>> Now, to full precision abs(1/2 - 4/5) > 3/10 is false, but at least on my
>> MacBook it is true in C double precision.
>>
>> I'm not sure why the loop is working with doubles, since multiplying by
>> m*n should make everything an integer.
>>
>> -thomas
>>
>> Thomas Lumley                   Assoc. Professor, Biostatistics
>> tlum...@u.washington.edu        University of Washington, Seattle

Thomas Lumley                   Assoc. Professor, Biostatistics
tlum...@u.washington.edu        University of Washington, Seattle

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
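
For readers who want to see the three variants discussed in the thread side by side, here is a minimal sketch in C. It is not the code in ks.c: the function names and the q_scaled parameter are made up for illustration, and only the comparison itself is taken from the quoted snippet. The first function is the comparison as quoted, the second adds 0.5/(m*n) to q as described in the reply, and the third is the "multiply everything by m*n" alternative, written with long long on the assumption that this delays (but does not remove) the overflow concern.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Comparison as quoted in the thread: plain doubles, no tolerance.
 * Values that are exactly equal on paper can compare as "greater"
 * once i/m and j/n have been rounded to double. */
static int exceeds_naive(int i, int j, int m, int n, double q)
{
    double md = m, nd = n;
    return fabs(i / md - j / nd) > q;
}

/* Fix described in the reply: pad q by half a grid step, 0.5/(m*n).
 * The differences i/m - j/n are multiples of 1/(m*n), so the padding
 * absorbs rounding error without admitting the next grid value. */
static int exceeds_padded(int i, int j, int m, int n, double q)
{
    double md = m, nd = n;
    assert(m > 0 && n > 0);
    return fabs(i / md - j / nd) > q + 0.5 / (md * nd);
}

/* Alternative raised in the thread: scale by m*n and compare integers.
 * q_scaled is q*m*n as an exact integer (hypothetical parameter).
 * long long only postpones overflow for large m and n. */
static int exceeds_scaled(int i, int j, int m, int n, long long q_scaled)
{
    long long d = (long long)i * n - (long long)j * m;
    return llabs(d) > q_scaled;
}
```

With the values from the thread (i=1, j=4, m=2, n=5, q=3/10), the naive form can report "greater" on IEEE doubles even though 3/10 > 3/10 is false, while exceeds_padded(1, 4, 2, 5, 3.0/10.0) and exceeds_scaled(1, 4, 2, 5, 3) both report that the threshold is not exceeded.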