On Nov 11, 2009, at 7:45 PM, Mauricio Calvao wrote:
Hi there
Sorry for what may be a naive or dumb question.
I have the following data:
> x <- c(1,2,3,4) # predictor vector
> y <- c(2,4,6,8) # response vector. Notice that it is an exact,
perfect straight line through the origin and slope equal to 2
> error <- c(0.3,0.3,0.3,0.3) # I have (equal) ``errors'', for
instance, in the measured responses
Which means those x, y, and "error" figures did not come from an
experiment, but rather from theory???
Of course the best fit coefficients should be 0 for the intercept
and 2 for the slope. Furthermore, it seems completely plausible (or
not?) that, since the y_i have associated non-vanishing
``errors'' (dispersions), there should be corresponding non-
vanishing ``errors'' associated with the best-fit coefficients, right?
When I try:
> fit_mod <- lm(y~x,weigths=1/error^2)
I get
Warning message:
In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
extra arguments weigths are just disregarded.
(Actually the weights are for adjusting for sampling, and I do not
see any sampling in your "design".)
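[As an aside, note what the warning itself is pointing at: the argument name in the call is misspelled (`weigths` rather than `weights`), so lm() passes it through `...` and discards it, and an ordinary unweighted fit is done instead. A minimal sketch, not part of the original exchange, illustrating the one-letter difference:]

```r
x <- c(1, 2, 3, 4)
y <- c(2, 4, 6, 8)
error <- c(0.3, 0.3, 0.3, 0.3)

## Misspelled argument: lm() has no 'weigths', so it is silently
## disregarded (with the warning quoted above) and an unweighted
## ordinary-least-squares fit is performed.
fit_bad  <- lm(y ~ x, weigths = 1/error^2)

## Correct spelling: the weights are actually used in the fit.
fit_good <- lm(y ~ x, weights = 1/error^2)

## With exact, noise-free data the weighted fit is still perfect, so the
## reported Std. Errors stay at ~1e-17 either way; supplying weights does
## not by itself inject the measurement uncertainty into summary().
coef(fit_good)   # intercept ~ 0, slope ~ 2
```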
Keeping on, despite the warning message, which I did not quite
understand, when I type:
> summary(fit_mod)
I get
Call:
lm(formula = y ~ x, weigths = 1/error^2)
Residuals:
1 2 3 4
-5.067e-17 8.445e-17 -1.689e-17 -1.689e-17
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.000e+00 8.776e-17 0.000e+00 1
x 2.000e+00 3.205e-17 6.241e+16 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.166e-17 on 2 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 3.895e+33 on 1 and 2 DF, p-value: < 2.2e-16
Naively, should not the column Std. Error be different from zero??
What I have in mind, and surely is not what Std. Error means, is that
if I carried out a large simulation, assuming each response y_i is a
Gaussian random variable with mean y_i and standard deviation
2*error = 0.6, and then made an ordinary least-squares fit of
the slope and intercept, I would end up with a mean for these
simulated coefficients which should be 2 and 0, respectively,
Well, not precisely 2 and 0, but rather something very close, i.e.,
within "experimental error". Please note that numbers in the range of
1e-17 are effectively zero from a numerical-analysis perspective.
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
> .Machine$double.eps ^ 0.5
[1] 1.490116e-08
and, that's the point, a non-vanishing standard deviation for these
fitted coefficients, right?? This, somehow, is what I expected to
be an estimate, or at least a good indicator, of the degree of
uncertainty which I should assign to the fitted coefficients; it
seems to me these deviations, thus calculated as a result of the
simulation, will certainly not be zero (or 3e-17, for that matter).
So this Std. Error does not provide what I, naively, think should be
given as a measure of the uncertainties or errors in the fitted
coefficients...
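[The simulation described above can be sketched in a few lines of R. This is assumed code, not part of the original exchange; it uses sd = 0.3, the stated measurement error, rather than the poster's 0.6, but either choice makes the point:]

```r
set.seed(1)                       # arbitrary seed, for reproducibility
x     <- c(1, 2, 3, 4)
sigma <- 0.3                      # the stated per-point measurement error
nsim  <- 10000

## Repeatedly draw noisy responses around the exact line y = 2x and refit;
## coefs is a 2 x nsim matrix of (intercept, slope) pairs.
coefs <- replicate(nsim, coef(lm(rnorm(4, mean = 2 * x, sd = sigma) ~ x)))

rowMeans(coefs)                   # means close to (0, 2), as expected
apply(coefs, 1, sd)               # clearly non-zero, unlike the ~1e-17
                                  # Std. Errors from the exact-data fit

## For known sigma, theory gives
##   SE(slope) = sigma / sqrt(sum((x - mean(x))^2)) = 0.3/sqrt(5) ~ 0.134,
## which the simulated sd of the slope should reproduce.
```

The ~1e-17 Std. Errors in the summary above are not wrong: lm() estimates the residual scale from the data, and with an exactly collinear y the estimated scale is numerically zero, so the coefficient standard errors collapse with it.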
You are trying to impose an error structure on a data situation that
you constructed artificially to be perfect.
What am I not getting right??
That if you input "perfection" into R's linear regression program, you
get appropriate warnings?
Thanks and sorry for the naive and non-expert question!
You are a Professor of physics, right? You do experiments, right? You
replicate them. So perhaps I'm the one who should be puzzled.
--
#######################################
Prof. Mauricio Ortiz Calvao
Federal University of Rio de Janeiro
Institute of Physics, P O Box 68528
CEP 21941-972 Rio de Janeiro, RJ
Brazil
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.