Re: numpy/scipy: correlation

2006-11-12 Thread sturlamolden
While I am at it, let's add the bootstrap estimate of the standard error as well.

from numpy import mean, std, sum, sqrt, sort, corrcoef, tanh, arctanh
from numpy.random import randint

def bootstrap_correlation(x, y):
    idx = randint(len(x), size=(1000, len(x)))
    bx = x[idx]
    by = y[idx]
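The preview above is cut off in the archive, so the rest of the original function is not known. As an illustration only, and not the poster's code, a bootstrap standard error of Pearson's r could be sketched as follows. The function name `bootstrap_corr_se`, the `n_boot` and `seed` parameters, and the use of `numpy.random.default_rng` are assumptions made for this sketch.

```python
import numpy as np

def bootstrap_corr_se(x, y, n_boot=1000, seed=0):
    """Bootstrap estimate of the standard error of Pearson's r (sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    rng = np.random.default_rng(seed)
    # Each row holds the indices of one resample, drawn with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    bx, by = x[idx], y[idx]
    # Row-wise Pearson r: centre every resample, then normalise.
    bx = bx - bx.mean(axis=1, keepdims=True)
    by = by - by.mean(axis=1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        r = (bx * by).sum(axis=1) / np.sqrt(
            (bx ** 2).sum(axis=1) * (by ** 2).sum(axis=1)
        )
    # Drop degenerate resamples (constant x or y), where r is undefined.
    r = r[np.isfinite(r)]
    # The spread of r over the resamples estimates its standard error.
    return r.std(ddof=1)
```

Pairs (x[i], y[i]) are resampled together, which is what keeps the dependence between x and y intact in each resample.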

Re: numpy/scipy: correlation

2006-11-12 Thread sturlamolden
robert wrote:
> > t = r * sqrt( (n-2)/(1-r**2) )
> yet too lazy/practical for digging these things from there. You obviously got
> it - out of that, what would be a final estimate for an error range of r (n
> big) ?
> that same "const. * (1-r**2)/sqrt(n)" which I found in that other document ?

Re: numpy/scipy: correlation

2006-11-12 Thread Robert Kern
robert wrote:
> I remember once I saw somewhere a formula for an error range of the corrcoef.
> but cannot find it anymore.

There is no such thing as "a formula for an error range" in a vacuum like that. Each formula has a model attached to it. If your data does not follow that model, then any

Re: numpy/scipy: correlation

2006-11-12 Thread Robert Kern
sturlamolden wrote:
> Robert Kern wrote:
>
>> The difference between the two models is that the first places no restrictions
>> on the distribution of x. The second does; both the x and y marginal
>> distributions need to be normal. Under the first model, the correlation
>> coefficient has no

Re: numpy/scipy: correlation

2006-11-12 Thread Ramon Diaz-Uriarte
On 11/12/06, robert <[EMAIL PROTECTED]> wrote:
> Robert Kern wrote:
> > robert wrote:
(...)
> One would expect the error range to drop simply with # of points. Yet it
> depends more complexly on the mean value of the coef and on the distribution
> at all.
> More interesting real-world cases: For

Re: numpy/scipy: correlation

2006-11-12 Thread robert
sturlamolden wrote:
> First, are you talking about rounding error (due to floating point
> arithmetic) or statistical sampling error?

About measured data. Rounding and sampling errors with special distributions are negligible. Thus by default assuming Gaussian noise in x and y. (This may explain

Re: numpy/scipy: correlation

2006-11-12 Thread sturlamolden
First, are you talking about rounding error (due to floating point arithmetic) or statistical sampling error? If you are talking about the latter, I suggest you look it up in a statistics textbook. E.g. if x and y are normally distributed, then

t = r * sqrt( (n-2)/(1-r**2) )

has a Student t-d
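As a small illustration of the statistic quoted above: under the null hypothesis of zero correlation and normally distributed x and y, t follows a Student t distribution with n-2 degrees of freedom. The helper name `corr_t_statistic` and its argument checks are assumptions of this sketch, not part of the original post.

```python
import math

def corr_t_statistic(r, n):
    """Return t = r * sqrt((n - 2) / (1 - r**2)) for a sample correlation r."""
    if n <= 2:
        raise ValueError("need more than two observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1.0 - r ** 2))
```

For example, `corr_t_statistic(0.5, 27)` evaluates 0.5 * sqrt(25 / 0.75), roughly 2.887, which would then be compared against a t distribution with 25 degrees of freedom.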

Re: numpy/scipy: correlation

2006-11-12 Thread robert
robert wrote:
> Robert Kern wrote:
> http://links.jstor.org/sici?sici=0162-1459(192906)24%3A166%3C170%3AFFPEOC%3E2.0.CO%3B2-Y
>
>
> tells:
> probable error of r = 0.6745*(1-r**2)/sqrt(N)
>
> A simple function of r and N - quite what I expected above roughly for
> the N-only dep.. But thus it
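The quoted expression is easy to evaluate directly. The constant 0.6745 is the 75th percentile of the standard normal distribution, so the "probable error" is 0.6745 times the approximate standard error (1-r**2)/sqrt(N), and about half of the sampling distribution falls within one probable error of the centre when the normal approximation holds. The function name below is an assumption made for this sketch.

```python
import math

def probable_error_of_r(r, n):
    """Probable error of r as quoted: 0.6745 * (1 - r**2) / sqrt(N)."""
    if n <= 0:
        raise ValueError("N must be positive")
    return 0.6745 * (1.0 - r ** 2) / math.sqrt(n)
```

With r = 0.5 and N = 100 this gives 0.6745 * 0.75 / 10 = 0.0505875. As the replies in this thread stress, such a formula is tied to a model (large N, bivariate normal data) and is only as good as that model.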

Re: numpy/scipy: correlation

2006-11-12 Thread sturlamolden
Robert Kern wrote:
> The difference between the two models is that the first places no restrictions
> on the distribution of x. The second does; both the x and y marginal
> distributions need to be normal. Under the first model, the correlation
> coefficient has no meaning.

That is not correct.

Re: numpy/scipy: correlation

2006-11-12 Thread robert
Robert Kern wrote:
> robert wrote:
>> Is there a ready made function in numpy/scipy to compute the correlation
>> y=mx+o of an X and Y fast:
>> m, m-err, o, o-err, r-coef,r-coef-err ?
>
> And of course, those three parameters are not particularly meaningful
> together.
> If your model is truly

Re: numpy/scipy: correlation

2006-11-11 Thread Robert Kern
robert wrote:
> Is there a ready made function in numpy/scipy to compute the correlation
> y=mx+o of an X and Y fast:
> m, m-err, o, o-err, r-coef,r-coef-err ?

And of course, those three parameters are not particularly meaningful together. If your model is truly "y is a linear response given x w

Re: numpy/scipy: correlation

2006-11-11 Thread Robert Kern
Robert Kern wrote:
> robert wrote:
>> Is there a ready made function in numpy/scipy to compute the correlation
>> y=mx+o of an X and Y fast:
>> m, m-err, o, o-err, r-coef,r-coef-err ?
> scipy.optimize.leastsq() can be told to return the covariance matrix of the
> estimated parameters (m and o in
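A minimal sketch of the approach mentioned in the quote, assuming SciPy is installed. With full_output=True, scipy.optimize.leastsq returns an unscaled matrix cov_x as its second result; multiplying it by the residual variance gives a covariance estimate for the fitted parameters. The helper name `fit_line`, the starting guess, and the returned tuple layout are assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import leastsq

def fit_line(x, y):
    """Fit y = m*x + o; return (m, m_err, o, o_err) with standard errors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def residuals(p):
        m, o = p
        return y - (m * x + o)

    # full_output=True makes leastsq return cov_x as the second element.
    p, cov_x, info, mesg, ier = leastsq(residuals, [1.0, 0.0], full_output=True)
    if cov_x is None:
        raise RuntimeError("covariance could not be estimated: " + mesg)
    # cov_x is unscaled: multiply by the residual variance (SSR / dof).
    dof = len(x) - len(p)
    s2 = (residuals(p) ** 2).sum() / dof
    m_err, o_err = np.sqrt(np.diag(cov_x * s2))
    return p[0], m_err, p[1], o_err
```

These standard errors assume the usual regression model: x known exactly, and independent errors of equal variance in y.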

Re: numpy/scipy: correlation

2006-11-11 Thread Robert Kern
robert wrote:
> Is there a ready made function in numpy/scipy to compute the correlation
> y=mx+o of an X and Y fast:
> m, m-err, o, o-err, r-coef,r-coef-err ?

numpy and scipy questions are best asked on their lists, not here. There are a number of people who know the capabilities of numpy and s

numpy/scipy: correlation

2006-11-11 Thread robert
Is there a ready made function in numpy/scipy to compute the correlation y=mx+o of an X and Y fast:
m, m-err, o, o-err, r-coef,r-coef-err ?
Or a formula to compute the 3 error ranges?

-robert

PS: numpy.corrcoef computes only the bare coeff:
>>> numpy.corrcoef((0,1,2,3.0),(2,5,6,7.0),)
arra