Murray Jorgensen wrote:
I am wondering how to interpret the parameter estimates that lm()
reports in this sort of situation:
y = round(rnorm(n=24,mean=5,sd=2),2)
A = gl(3,2,24,labels=c("one","two","three"))
B = gl(4,6,24,labels=c("i","ii","iii","iv"))
# Make both observations for A=1, B=4 missing
y[19] = NA
y[20] = NA
data.frame(y,A,B)
nonadd = lm(y ~ A * B)
summary(nonadd)
Call:
lm(formula = y ~ A * B)

Residuals:
       Min         1Q     Median         3Q        Max
-3.555e+00 -7.675e-01 -6.939e-17  7.675e-01  3.555e+00

Coefficients: (1 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    3.755      1.667   2.252   0.0457 *
Atwo           1.655      2.358   0.702   0.4974
Athree         3.330      2.358   1.412   0.1856
Bii            1.435      2.358   0.609   0.5552
Biii           2.055      2.358   0.871   0.4021
Biv           -1.635      2.358  -0.693   0.5025
Atwo:Bii      -1.145      3.335  -0.343   0.7378
Athree:Bii    -4.535      3.335  -1.360   0.2011
Atwo:Biii     -3.230      3.335  -0.969   0.3536
Athree:Biii   -2.105      3.335  -0.631   0.5408
Atwo:Biv       1.655      3.335   0.496   0.6295
Athree:Biv        NA         NA      NA       NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.358 on 11 degrees of freedom
  (2 observations deleted due to missingness)
Multiple R-squared: 0.2797, Adjusted R-squared: -0.3752
F-statistic: 0.4271 on 10 and 11 DF, p-value: 0.9044

fitted(nonadd)
    1     2     3     4     5     6     7     8     9    10    11    12
3.755 3.755 5.410 5.410 7.085 7.085 5.190 5.190 5.700 5.700 3.985 3.985
   13    14    15    16    17    18    21    22    23    24
5.810 5.810 4.235 4.235 7.035 7.035 5.430 5.430 5.450 5.450
t(model.matrix(nonadd)%*%coef(nonadd))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 21 22 23 24
[1,] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
I guess that the parameter estimates reported are linear combinations of
the cell means, but which linear combinations and how does lm() decide
what parameters to report?
Cheers, Murray
Peter Dalgaard replied:
What's the problem? The parameters are defined as usual for the two-way
layout:
The intercept is the fitted value in the top left corner
The A coefficients are the fitted values in the first column minus the
intercept.
The B coefficients vice versa.
The interaction coefficients are the fitted values minus the sum of the
intercept and the corresponding A and B coefficients.
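
These rules can be checked directly against a table of cell means. The following is a minimal sketch, not part of the original exchange: the seed is an assumption (the post did not set one, so the numbers will differ from the output above), and `same` and `checks` are hypothetical names introduced for the comparison. The table `m` has A in rows and B in columns, so "first column" means B = i. Only coefficients whose cells all contain data are checked; the B = iv column is left out because its first-row cell (A = one, B = iv) is empty.

```r
set.seed(1)  # assumed seed; the original post did not set one
y <- round(rnorm(n = 24, mean = 5, sd = 2), 2)
A <- gl(3, 2, 24, labels = c("one", "two", "three"))
B <- gl(4, 6, 24, labels = c("i", "ii", "iii", "iv"))
y[19:20] <- NA  # empty the (A = one, B = iv) cell, as in the post

nonadd <- lm(y ~ A * B)
b <- coef(nonadd)

# 3 x 4 table of cell means: rows are levels of A, columns are levels of B.
# The empty cell comes out as NaN.
m <- tapply(y, list(A, B), mean, na.rm = TRUE)

# Hypothetical helper: compare two numbers up to rounding error
same <- function(u, v) isTRUE(all.equal(unname(u), unname(v)))

checks <- c(
  # intercept = top left corner
  intercept = same(b["(Intercept)"], m["one", "i"]),
  # A coefficient = first column minus intercept
  Atwo      = same(b["Atwo"], m["two", "i"] - m["one", "i"]),
  # B coefficient = first row minus intercept
  Bii       = same(b["Bii"], m["one", "ii"] - m["one", "i"]),
  # interaction = cell mean minus (intercept + A coefficient + B coefficient)
  Atwo_Bii  = same(b["Atwo:Bii"],
                   m["two", "ii"] - (b["(Intercept)"] + b["Atwo"] + b["Bii"]))
)
print(checks)
```

Every entry of `checks` should come out `TRUE` if the rules hold for these cells.
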
One interaction coefficient is reported as NA because the cell A = one,
B = iv has no data; apart from that, the fitted values equal the cell means.
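
A short sketch of what the NA coefficient means in practice follows. It is an illustration under assumptions, not part of the original reply: the seed is assumed, and `aliased`, `b0`, `fit_by_hand`, `d_two` and `d_three` are names introduced here. As far as I understand lm(), it works through the columns of the model matrix in order and drops a column that is a linear combination of earlier ones; here the Biv column equals Atwo:Biv plus Athree:Biv, so the last of the three is dropped. Treating the dropped coefficient as zero makes the matrix product from the question agree with fitted(). It also changes the reading of two B = iv coefficients, since the first-row cell is empty: in the output above, Biv = 5.450 - 7.085 = -1.635 is the iv-minus-i difference within A = three, and Atwo:Biv = (5.430 - 5.410) - (-1.635) = 1.655.

```r
set.seed(1)  # assumed seed; the original post did not set one
y <- round(rnorm(n = 24, mean = 5, sd = 2), 2)
A <- gl(3, 2, 24, labels = c("one", "two", "three"))
B <- gl(4, 6, 24, labels = c("i", "ii", "iii", "iv"))
y[19:20] <- NA  # empty the (A = one, B = iv) cell, as in the post

nonadd <- lm(y ~ A * B)
b <- coef(nonadd)
aliased <- names(b)[is.na(b)]  # coefficient(s) lm() could not estimate
print(aliased)

# X %*% b is all NA because one entry of b is NA. Setting that entry to
# zero gives a product that can be compared with fitted().
b0 <- b
b0[is.na(b0)] <- 0
fit_by_hand <- drop(model.matrix(nonadd) %*% b0)
print(all.equal(unname(fit_by_hand), unname(fitted(nonadd))))

# With Athree:Biv fixed at zero, Biv is the iv-minus-i difference within
# A = three, and Atwo:Biv compares that difference between A = two and three.
m <- tapply(y, list(A, B), mean, na.rm = TRUE)
d_three <- m["three", "iv"] - m["three", "i"]
d_two   <- m["two", "iv"]   - m["two", "i"]
print(c(Biv        = isTRUE(all.equal(unname(b["Biv"]), d_three)),
        `Atwo:Biv` = isTRUE(all.equal(unname(b["Atwo:Biv"]), d_two - d_three))))
```
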
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907