robert wrote:
> > t = r * sqrt( (n-2)/(1-r**2) )
> yet too lazy/practical for digging these things from there. You obviously got
> it - out of that, what would be a final estimate for an error range of r (n
> big) ?
> that same "const. * (1-r**2)/sqrt(n)" which I found in that other document ?

I gave you the formula. Solve for r and you get the confidence interval. You
will need to use the inverse cumulative Student t distribution.

Another quick-and-dirty solution is to use bootstrapping.

from numpy import mean, std, sum, sort
from numpy.random import randint

def bootstrap_correlation(x,y):
    idx = randint(len(x),size=(1000,len(x)))
    bx = x[idx]   # resamples x with replacement
    by = y[idx]   # resamples y with replacement
    mx = mean(bx,1)
    my = mean(by,1)
    sx = std(bx,1)   # population std (ddof=0), so divide by len(x) below
    sy = std(by,1)
    r = sort(sum((bx - mx.repeat(len(x),0).reshape(bx.shape))
                 * (by - my.repeat(len(y),0).reshape(by.shape)), 1)
             / (len(x)*sx*sy))
    # bootstrap confidence interval (NB! biased)
    return (r[25], r[975])

> My main concern is, how to respect the fact, that the (x,y) points may not
> distribute well along the regression line.

The bootstrap is "non-parametric" in the sense that it is distribution free.
--
http://mail.python.org/mailman/listinfo/python-list
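As a sketch of the t-based route (assuming scipy is available; its `t.ppf` is the inverse cumulative Student t distribution mentioned above), inverting t = r*sqrt((n-2)/(1-r**2)) gives r = t/sqrt(n-2+t**2), the value of r at which the test statistic reaches a given t quantile:

```python
from math import sqrt
from scipy.stats import t as student_t  # t.ppf = inverse cumulative t

def r_critical(n, alpha=0.05):
    """Smallest |r| significant at level alpha for sample size n."""
    # two-sided critical value of t with n-2 degrees of freedom
    t_crit = student_t.ppf(1 - alpha/2, n - 2)
    # invert t = r*sqrt((n-2)/(1-r**2))  =>  r = t/sqrt(n-2+t**2)
    return t_crit / sqrt(n - 2 + t_crit**2)
```

For large n, t_crit approaches a constant (about 1.96 at 95%), so this bound shrinks roughly like const/sqrt(n), consistent with the "const. * (1-r**2)/sqrt(n)" behaviour robert asked about when r is near 0.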
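The bootstrap above can be exercised on toy data like this (the data, seed, and sample sizes here are made up for illustration; broadcasting with `mx[:, None]` replaces the repeat/reshape dance but computes the same thing):

```python
import numpy as np

np.random.seed(0)                       # reproducible toy data
x = np.random.randn(500)
y = x + np.random.randn(500)            # true correlation ~ 0.71

idx = np.random.randint(len(x), size=(1000, len(x)))
bx, by = x[idx], y[idx]                 # 1000 resamples with replacement
mx, my = bx.mean(1), by.mean(1)
num = ((bx - mx[:, None]) * (by - my[:, None])).sum(1)
# population std (ddof=0) in numerator and denominator, so divide by len(x)
r = np.sort(num / (len(x) * bx.std(1) * by.std(1)))
lo, hi = r[25], r[975]                  # ~95% percentile interval
```

The resulting interval should be a narrow band around the sample correlation; whether its width matches the t-based interval is exactly what the percentile bootstrap lets you check without distributional assumptions.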