On Nov 14, 2009, at 1:50 PM, Mauricio O Calvao wrote:
David Winsemius <dwinsemius <at> comcast.net> writes:
Which means those x, y, and "error" figures did not come from an
experiment, but rather from theory???
The fact is I am trying to compare the results of:
(1) lm under R and
(2) the Java applet at http://omnis.if.ufrj.br/~carlos/applets/reta/reta.html
(3) the Fit method of the ROOT system used by CERN,
(4) the Numerical Recipes functions for weighted linear regression
The three last ones all provide, for the "fake" data set I furnished
in my first
post in this thread, the same results; particularly they give erros or
uncertainties in the estimated coefficients of intercept and slope
which, as
seems intuitive, are not zero at all, but of the order 0.1 or 0.2,
whereas the
method lm under R issues a "Std. Error", which is zero.
Independently of
terminology, which sure is of utmost importance, the data I provided
should give
rise to a best fit straight line with intercept zero and slope 2,
but also with
non-vanishing errors associated with them. How do I get this from
lm????
I only want, for instance, calculation of the so-called covariance
matrix for
the estimated coefficients, as given, for instance, in Equation
(2.3.2) of the
second edition of Draper and Smith, "Applied regression analysis";
this is a
standard statistical result, right? So why does R not directly
provide it as a
summary from an lm object???
It's really not that difficult to get the variance covariance matrix.
What is not so clear is why you think differential weighting of a set
that has a perfect fit should give meaningfully different results than
a fit that has no weights.
?lm
?vcov
> y <- c(2,4,6,8) # response vect
> fit_mod <- lm(y~x,weights=1/error^2)
Error in eval(expr, envir, enclos) : object 'error' not found
> error <- c(0.3,0.3,0.3,0.3)
> fit_mod <- lm(y~x,weights=1/error^2)
> vcov(fit_mod)
(Intercept) x
(Intercept) 2.396165e-30 -7.987217e-31
x -7.987217e-31 3.194887e-31
Numerically those are effectively zero.
> fit_mod <- lm(y~x)
> vcov(fit_mod)
(Intercept) x
(Intercept) 0 0
x 0 0
--
David.
Of course the best fit coefficients should be 0 for the intercept
and 2 for the slope. Furthermore, it seems completely plausible (or
not?) that, since the y_i have associated non-vanishing
``errors'' (dispersions), there should be corresponding non-
vanishing ``errors'' associated to the best fit coefficients, right?
When I try:
fit_mod <- lm(y~x,weights=1/error^2)
I get
Warning message:
In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
extra arguments weigths are just disregarded.
(Actually the weights are for adjusting for sampling, and I do not
see any sampling in your "design".)
Keeping on, despite the warning message, which I did not quite
understand, when I type:
summary(fit_mod)
I get
Call:
lm(formula = y ~ x, weigths = 1/error^2)
Residuals:
1 2 3 4
-5.067e-17 8.445e-17 -1.689e-17 -1.689e-17
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.000e+00 8.776e-17 0.000e+00 1
x 2.000e+00 3.205e-17 6.241e+16 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.166e-17 on 2 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 3.895e+33 on 1 and 2 DF, p-value: < 2.2e-16
Naively, should not the column Std. Error be different from zero??
What I have in mind, and sure is not what Std. Error means, is that
if I carried out a large simulation, assuming each response y_i a
Gaussian random variable with mean y_i and standard deviation
2*error=0.6, and then making an ordinary least squares fitting of
the slope and intercept, I would end up with a mean for these
simulated coefficients which should be 2 and 0, respectively,
Well, not precisely 2 and 0, but rather something very close ... i.e,
within "experimental error". Please note that numbers in the range of
10e-17 are effectively zero from a numerical analysis perspective.
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
.Machine$double.eps ^ 0.5
[1] 1.490116e-08
I know this all too well and it is obviously a trivial supernewbie
issue, which
I have already overcome a long time ago...
and, that's the point, a non-vanishing standard deviation for these
fitted coefficients, right?? This somehow is what I expected should
be an estimate or, at least, a good indicator, of the degree of
uncertainty which I should assign to the fitted coefficients; it
seems to me these deviations, thus calculated as a result of the
simulation, will certainly not be zero (or 3e-17, for that matter).
So this Std. Error does not provide what I, naively, think should be
given as a measure of the uncertainties or errors in the fitted
coefficients...
You are trying to impose an error structure on a data situation that
you constructed artificially to be perfect.
What am I not getting right??
That if you input "perfection" into R's linear regression program,
you
get appropriate warnings?
Thanks and sorry for the naive and non-expert question!
You are a Professor of physics, right? You do experiments, right? You
replicate them. S0 perhaps I'm the one who should be puzzled.
Unfortunately you eschewed answering objectively any of my
questions; I insist
they do make sense. Don't mention the data are perfect; this does
not help to
make any progress in understanding the choice of convenient summary
info the lm
method provides, as compared to what, in my humble opinion and in
this specific
particular case, it should provide: the covariance matrix of the
estimated
coefficients...
--
#######################################
Prof. Mauricio Ortiz Calvao
Federal University of Rio de Janeiro
Institute of Physics, P O Box 68528
CEP 21941-972 Rio de Janeiro, RJ
Brazil
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.