Re: [R] Simple question about formulae in R!?

S Ellison Fri, 10 Aug 2012 09:19:18 -0700

> > R in general tries hard to prohibit this behavior (i.e.,  including an 
> > interaction but not the main effect). When removing a main effect and 
> > leaving the interaction, the number of parameters is not reduced by 
> > one (as would be expected) but stays the same, at least 
> > when using model.matrix:


Surely this behaviour has less to do with a dislike of interactions without both 
main effects (which we necessarily use when we fit a simple two-factor nested 
model) than with the need to avoid non-uniqueness in a model fitted with too 
many coefficients? 

In a simple case, an intercept plus n coefficients for n factor levels gives us 
n+1 coefficients to find, and we only have n independent groups to estimate 
them from. In model matrix terms, one column (the intercept) is a linear 
combination of the others: it is the sum of the n level indicator columns. For 
the OLS normal equations that makes X'X singular (zero determinant), and for the 
numerical methods R uses the effect is the same: no unique solution. To avoid 
that and allow least squares fitting, R sets up the model matrix with only n-1 
coefficients in addition to the intercept. As a result we end up with fewer 
model coefficients than we might have expected (and that annoyingly missing 
first level that always puzzles newcomers the first time they look at a linear 
model summary), but we have exactly the number of coefficients that we can 
estimate uniquely from the groups we have specified.
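
A small sketch of the point above, for anyone who wants to check it at the 
console. The factors f and g and the names X_full and X_default are made up for 
illustration and are not from the original thread; the default treatment 
contrasts are assumed.

```r
# Hypothetical toy data: f has n = 3 levels, g has 2 levels, one row per cell.
f <- factor(rep(c("a", "b", "c"), each = 2))
g <- factor(rep(c("x", "y"), times = 3))

# Over-parameterised design: intercept plus one indicator column per level,
# i.e. n + 1 = 4 columns for only n = 3 groups.
X_full <- cbind(Intercept = 1, model.matrix(~ 0 + f))
ncol(X_full)             # number of columns
qr(X_full)$rank          # rank is lower: the intercept is the sum of the indicators
det(crossprod(X_full))   # determinant of X'X, zero up to rounding error

# What R builds by default: intercept plus n - 1 columns, full column rank.
X_default <- model.matrix(~ f)
colnames(X_default)
qr(X_default)$rank

# The case from the quoted text: drop the main effect of f, keep the interaction.
# R recodes f:g so the column count is unchanged (a nested parameterisation).
ncol(model.matrix(~ f * g))
ncol(model.matrix(~ g + f:g))
```

Comparing the two ncol() results at the end shows the column count staying the 
same when the main effect of f is removed: the same cell means are being 
estimated, only labelled differently.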

S


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
