Re: [R] two-factor linear models with missing cells

David Winsemius Sun, 02 Aug 2009 16:00:57 -0700

Does this help at all?

<after your code...>


> contrasts(A)
      two three
one     0     0
two     1     0
three   0     1
> contrasts(B)
    ii iii iv
i    0   0  0
ii   1   0  0
iii  0   1  0
iv   0   0  1
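A note on reading these two matrices: they look like R's default treatment contrasts, where the first level of each factor is the baseline. The sketch below checks that, assuming options("contrasts") has not been changed from its default. A2 is a name introduced only for illustration.

```r
A <- gl(3, 2, 24, labels = c("one", "two", "three"))
B <- gl(4, 6, 24, labels = c("i", "ii", "iii", "iv"))

# Assumption: options("contrasts") is at its default, so unordered
# factors get treatment contrasts.
getOption("contrasts")

# Treatment contrasts: the first level is the baseline (all-zero row),
# and every other level gets its own indicator column.
same_A <- all.equal(contrasts(A), contr.treatment(levels(A)))
same_B <- all.equal(contrasts(B), contr.treatment(levels(B)))
same_A
same_B

# Choosing another baseline changes which level has the all-zero row,
# and therefore which cell the intercept refers to.
A2 <- relevel(A, ref = "three")
contrasts(A2)
```

With a different reference level the same model is fitted, but the reported coefficients describe differences from a different cell.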

> contrasts(A:B)

          one:ii one:iii one:iv two:i two:ii two:iii two:iv three:i three:ii three:iii three:iv
one:i          0       0      0     0      0       0      0       0        0         0        0
one:ii         1       0      0     0      0       0      0       0        0         0        0
one:iii        0       1      0     0      0       0      0       0        0         0        0
one:iv         0       0      1     0      0       0      0       0        0         0        0
two:i          0       0      0     1      0       0      0       0        0         0        0
two:ii         0       0      0     0      1       0      0       0        0         0        0
two:iii        0       0      0     0      0       1      0       0        0         0        0
two:iv         0       0      0     0      0       0      1       0        0         0        0
three:i        0       0      0     0      0       0      0       1        0         0        0
three:ii       0       0      0     0      0       0      0       0        1         0        0
three:iii      0       0      0     0      0       0      0       0        0         1        0
three:iv       0       0      0     0      0       0      0       0        0         0        1
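One way to see what the treatment coding implies for the coefficients is to compare them with the observed cell means. This is a rough sketch, not a full answer. The set.seed(1) call is an assumption (the original post set no seed, so the numbers will differ from those quoted below), and same() is a helper name made up for this example.

```r
set.seed(1)  # assumption: the original post set no seed
y <- round(rnorm(n = 24, mean = 5, sd = 2), 2)
A <- gl(3, 2, 24, labels = c("one", "two", "three"))
B <- gl(4, 6, 24, labels = c("i", "ii", "iii", "iv"))
y[19:20] <- NA                      # empty the (one, iv) cell
nonadd <- lm(y ~ A * B)

b  <- coef(nonadd)
cm <- tapply(y, list(A, B), mean)   # 3 x 4 cell means, NA for (one, iv)

# Hypothetical helper: does a coefficient equal a cell-mean expression?
same <- function(coef_name, value) {
  isTRUE(all.equal(unname(b[coef_name]), value))
}

checks <- c(
  # Cells not touching level iv: the usual treatment-contrast reading
  intercept = same("(Intercept)", cm["one", "i"]),
  Atwo      = same("Atwo", cm["two", "i"] - cm["one", "i"]),
  Bii       = same("Bii",  cm["one", "ii"] - cm["one", "i"]),
  Atwo_Bii  = same("Atwo:Bii",
                   cm["two", "ii"] - cm["two", "i"] -
                   cm["one", "ii"] + cm["one", "i"]),
  # (one, iv) is empty and Athree:Biv is dropped, so the iv effect is
  # measured within A = "three" instead of within A = "one"
  Biv       = same("Biv", cm["three", "iv"] - cm["three", "i"]),
  Atwo_Biv  = same("Atwo:Biv",
                   (cm["two", "iv"] - cm["two", "i"]) -
                   (cm["three", "iv"] - cm["three", "i"]))
)
checks
```

The same identities hold for the estimates quoted in the original message, for example Biv = 5.450 - 7.085 = -1.635.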

--
David
On Aug 2, 2009, at 6:40 PM, Murray Jorgensen wrote:

I am wondering how to interpret the parameter estimates that lm()
reports in this sort of situation:

y = round(rnorm(n=24,mean=5,sd=2),2)
A = gl(3,2,24,labels=c("one","two","three"))
B = gl(4,6,24,labels=c("i","ii","iii","iv"))
# Make both observations in the cell A = "one", B = "iv" missing
y[19] = NA
y[20] = NA
data.frame(y,A,B)
nonadd = lm(y ~ A * B)

summary(nonadd)


Call:
lm(formula = y ~ A * B)

Residuals:
       Min         1Q     Median         3Q        Max
-3.555e+00 -7.675e-01 -6.939e-17  7.675e-01  3.555e+00

Coefficients: (1 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    3.755      1.667   2.252   0.0457 *
Atwo           1.655      2.358   0.702   0.4974
Athree         3.330      2.358   1.412   0.1856
Bii            1.435      2.358   0.609   0.5552
Biii           2.055      2.358   0.871   0.4021
Biv           -1.635      2.358  -0.693   0.5025
Atwo:Bii      -1.145      3.335  -0.343   0.7378
Athree:Bii    -4.535      3.335  -1.360   0.2011
Atwo:Biii     -3.230      3.335  -0.969   0.3536
Athree:Biii   -2.105      3.335  -0.631   0.5408
Atwo:Biv       1.655      3.335   0.496   0.6295
Athree:Biv        NA         NA      NA       NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.358 on 11 degrees of freedom
(2 observations deleted due to missingness)
Multiple R-squared: 0.2797, Adjusted R-squared: -0.3752
F-statistic: 0.4271 on 10 and 11 DF, p-value: 0.9044

fitted(nonadd)

    1     2     3     4     5     6     7     8     9    10    11    12
3.755 3.755 5.410 5.410 7.085 7.085 5.190 5.190 5.700 5.700 3.985 3.985
   13    14    15    16    17    18    21    22    23    24
5.810 5.810 4.235 4.235 7.035 7.035 5.430 5.430 5.450 5.450

t(model.matrix(nonadd)%*%coef(nonadd))

      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 21 22 23 24
[1,] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
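The row of NA values above arises because coef() returns NA for the aliased coefficient, and a single NA propagates through every row of the matrix product. A minimal sketch of one workaround follows, assuming the dropped column is treated as contributing zero. The set.seed(1) call and the names b0 and manual are additions for illustration, so the numbers will not match the post.

```r
set.seed(1)  # assumption: the original post set no seed
y <- round(rnorm(n = 24, mean = 5, sd = 2), 2)
A <- gl(3, 2, 24, labels = c("one", "two", "three"))
B <- gl(4, 6, 24, labels = c("i", "ii", "iii", "iv"))
y[19:20] <- NA
nonadd <- lm(y ~ A * B)

b <- coef(nonadd)
aliased <- names(b)[is.na(b)]   # coefficient(s) lm() could not estimate
aliased

# Treat the dropped column as contributing nothing to the fit
b0 <- b
b0[is.na(b0)] <- 0

manual <- drop(model.matrix(nonadd) %*% b0)
all.equal(manual, fitted(nonadd))

# alias() reports the linear dependency among the design columns
alias(nonadd)
```

In this design the Biv column equals the sum of the Atwo:Biv and Athree:Biv columns once observations 19 and 20 are gone, which is why one of the three cannot be estimated.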

I guess that the parameter estimates reported are linear combinations of
the cell means, but which linear combinations, and how does lm() decide
which parameters to report?

Cheers, Murray

--
Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
