On Nov 11, 2009, at 7:45 PM, Mauricio Calvao wrote:
Hi there
Sorry for what may be a naive or dumb question.
I have the following data:
> x <- c(1,2,3,4) # predictor vector
> y <- c(2,4,6,8) # response vector. Notice that it is an exact,
perfect straight line through the origin and slope equal to 2
> error <- c(0.3,0.3,0.3,0.3) # I have (equal) ``errors'', for
instance, in the measured responses
Which means those x, y, and "error" figures did not come from an
experiment, but rather from theory???
Of course the best fit coefficients should be 0 for the intercept
and 2 for the slope. Furthermore, it seems completely plausible (or
not?) that, since the y_i have associated non-vanishing
``errors'' (dispersions), there should be corresponding non-
vanishing ``errors'' associated with the best-fit coefficients, right?
When I try:
> fit_mod <- lm(y~x,weigths=1/error^2)
I get
Warning message:
In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
extra arguments weigths are just disregarded.
(Actually the weights are for adjusting for sampling, and I do not
see any sampling in your "design".)
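[As an aside, note what the warning itself is pointing at: the argument name in the call is misspelled (`weigths` rather than `weights`), so lm() passes it through `...` and discards it, and an ordinary unweighted fit is done instead. A minimal sketch, not part of the original exchange, illustrating the one-letter difference:]

```r
x <- c(1, 2, 3, 4)
y <- c(2, 4, 6, 8)
error <- c(0.3, 0.3, 0.3, 0.3)

## Misspelled argument: lm() has no 'weigths', so it is silently
## disregarded (with the warning quoted above) and an unweighted
## ordinary-least-squares fit is performed.
fit_bad  <- lm(y ~ x, weigths = 1/error^2)

## Correct spelling: the weights are actually used in the fit.
fit_good <- lm(y ~ x, weights = 1/error^2)

## With exact, noise-free data the weighted fit is still perfect, so the
## reported Std. Errors stay at ~1e-17 either way; supplying weights does
## not by itself inject the measurement uncertainty into summary().
coef(fit_good)   # intercept ~ 0, slope ~ 2
```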
Keeping on, despite the warning message, which I did not quite
understand, when I type:
> summary(fit_mod)
I get
Call:
lm(formula = y ~ x, weigths = 1/error^2)
Residuals:
1 2 3 4
-5.067e-17 8.445e-17 -1.689e-17 -1.689e-17
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.000e+00 8.776e-17 0.000e+00 1
x 2.000e+00 3.205e-17 6.241e+16 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.166e-17 on 2 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 3.895e+33 on 1 and 2 DF, p-value: < 2.2e-16
Naively, should not the column Std. Error be different from zero??
What I have in mind, and surely is not what Std. Error means, is that
if I carried out a large simulation, assuming each response y_i is a
Gaussian random variable with mean y_i and standard deviation
2*error = 0.6, and then made an ordinary least-squares fit of
the slope and intercept, I would end up with a mean for these
simulated coefficients which should be 2 and 0, respectively,
Well, not precisely 2 and 0, but rather something very close, i.e.,
within "experimental error". Please note that numbers in the range of
1e-17 are effectively zero from a numerical-analysis perspective.
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
> .Machine$double.eps ^ 0.5
[1] 1.490116e-08
and, that's the point, a non-vanishing standard deviation for these
fitted coefficients, right?? This, somehow, is what I expected to
be an estimate, or at least a good indicator, of the degree of
uncertainty which I should assign to the fitted coefficients; it
seems to me these deviations, thus calculated as a result of the
simulation, will certainly not be zero (or 3e-17, for that matter).
So this Std. Error does not provide what I, naively, think should be
given as a measure of the uncertainties or errors in the fitted
coefficients...
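[The simulation described above can be sketched in a few lines of R. This is assumed code, not part of the original exchange; it uses sd = 0.3, the stated measurement error, rather than the poster's 0.6, but either choice makes the point:]

```r
set.seed(1)                       # arbitrary seed, for reproducibility
x     <- c(1, 2, 3, 4)
sigma <- 0.3                      # the stated per-point measurement error
nsim  <- 10000

## Repeatedly draw noisy responses around the exact line y = 2x and refit;
## coefs is a 2 x nsim matrix of (intercept, slope) pairs.
coefs <- replicate(nsim, coef(lm(rnorm(4, mean = 2 * x, sd = sigma) ~ x)))

rowMeans(coefs)                   # means close to (0, 2), as expected
apply(coefs, 1, sd)               # clearly non-zero, unlike the ~1e-17
                                  # Std. Errors from the exact-data fit

## For known sigma, theory gives
##   SE(slope) = sigma / sqrt(sum((x - mean(x))^2)) = 0.3/sqrt(5) ~ 0.134,
## which the simulated sd of the slope should reproduce.
```

The ~1e-17 Std. Errors in the summary above are not wrong: lm() estimates the residual scale from the data, and with an exactly collinear y the estimated scale is numerically zero, so the coefficient standard errors collapse with it.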
You are trying to impose an error structure on a data situation that
you constructed artificially to be perfect.
What am I not getting right??
That if you input "perfection" into R's linear regression program, you
get appropriate warnings?
Thanks and sorry for the naive and non-expert question!
You are a Professor of physics, right? You do experiments, right? You
replicate them. So perhaps I'm the one who should be puzzled.
--
#######################################
Prof. Mauricio Ortiz Calvao
Federal University of Rio de Janeiro
Institute of Physics, P O Box 68528
CEP 21941-972 Rio de Janeiro, RJ
Brazil
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.