On 05/02/14 22:40, Marco Inacio wrote:
Hello all, can someone help clarify something?
According to R's lm() doc:
Non-NULL weights can be used to indicate that different observations
have different variances (with the values in weights being inversely
*proportional* to the variances); or equivalently, when the elements
of weights are positive integers w_i, that each response y_i is the
mean of w_i unit-weight observations (including the case that there
are w_i observations equal to y_i and the data have been summarized).
Since the idea here is *proportion*, not equality, shouldn't the weight
vectors x and 2*x give the same result? And yet they don't; the residual
standard error differs:
summary(lm(c(1,2,3,1,2,3) ~ c(1,2.1,2.9,1.1,2,3), weights=rep(1,6)))$sigma
[1] 0.07108323
summary(lm(c(1,2,3,1,2,3) ~ c(1,2.1,2.9,1.1,2,3), weights=rep(2,6)))$sigma
[1] 0.1005269
The weights are in fact case weights, i.e., a weight of 2 is the same as
including the corresponding item twice. I agree that the documentation
is no wonder of clarity in this respect.
Btw, note that, in your example, (0.1005269 / 0.07108323)^2 = 2, your
constant weight.
Göran Broström
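For what it is worth, a minimal sketch using the same data suggests
that only the residual standard error is rescaled when the weights are
multiplied by a constant, while the coefficient estimates and their
standard errors are unaffected. The names y, x, fit1 and fit2 are
introduced here for illustration only and are not part of the original
post.

```r
# Data from the example above (y, x, fit1, fit2 are illustrative names)
y <- c(1, 2, 3, 1, 2, 3)
x <- c(1, 2.1, 2.9, 1.1, 2, 3)

fit1 <- lm(y ~ x, weights = rep(1, 6))
fit2 <- lm(y ~ x, weights = rep(2, 6))

# Compare the full coefficient tables
# (estimates, standard errors, t values, p values)
all.equal(coef(summary(fit1)), coef(summary(fit2)))

# Ratio of the residual standard errors; its square is the constant
# by which the weights were multiplied
summary(fit2)$sigma / summary(fit1)$sigma
```

So the reported sigma refers to an observation with weight 1, which is
why it depends on the scale of the weights even though the inference
about the coefficients does not.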
So what if I know a priori that observation A has twice the variance of
observation B? Both weights=c(1,2) and weights=c(2,4) (and so on)
represent this knowledge equally well, but we get different regression
results (since sigma is different).
Also, if we do the same thing with a glm() model, then we get a lot of
other differences, for example in the deviance.
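A small sketch of the glm() case, assuming the default gaussian family
and a made-up weight vector w (hypothetical, chosen only so that some
observations have twice the weight of others): the deviance is
multiplied by the same constant as the weights, but the coefficient
table should be unchanged, because the dispersion is estimated from the
weighted residuals. This is not claimed for families with a fixed
dispersion such as binomial or poisson.

```r
# Same data as in the lm() example; w is a hypothetical weight vector
y <- c(1, 2, 3, 1, 2, 3)
x <- c(1, 2.1, 2.9, 1.1, 2, 3)
w <- c(1, 2, 1, 2, 1, 2)

g1 <- glm(y ~ x, weights = w)      # gaussian family is the default
g2 <- glm(y ~ x, weights = 2 * w)  # same relative weights, doubled

# Estimates, standard errors, t values and p values
all.equal(coef(summary(g1)), coef(summary(g2)))

# The deviance is the weighted residual sum of squares here,
# so it follows the scale of the weights
deviance(g2) / deviance(g1)
```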
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.