Re: [R] Simple question about formulae in R!?

Bert Gunter Fri, 10 Aug 2012 09:43:09 -0700

Sheesh! Yes.
... and in the case where B is a factor with k levels and x is
continuous, the model ~B:x yields k+1 parameters. model.matrix()
reports these as a constant term plus one x slope per level of B,
which spans the same space as a constant term, x, and k-1 interactions
between x and the corresponding k-1 "contrasts" (which they aren't
really) for B. ~B*x would add the k-1 B main effect contrasts.
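
A minimal sketch to check those counts, assuming a made-up factor B
with k = 3 levels and a random continuous x (both invented here for
illustration, not taken from the thread):

```r
set.seed(1)
k <- 3
B <- factor(rep(letters[1:k], each = 4))  # hypothetical factor, k levels
x <- rnorm(length(B))                     # hypothetical continuous covariate

mm_int  <- model.matrix(~ B:x)    # interaction only
mm_full <- model.matrix(~ B * x)  # main effects plus interaction

colnames(mm_int)   # intercept plus one x slope per level of B
ncol(mm_int)       # k + 1
ncol(mm_full)      # 2 * k, i.e. k - 1 more columns for the B main effect

# x is the sum of the per-level slope columns, so appending it
# does not raise the rank of the ~B:x model matrix.
qr(cbind(mm_int, x))$rank
```

The last line is one way to see that the per-level slopes and the
"x plus k-1 interactions" description are the same model written in
two parameterizations.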

But to be fair, this can get complicated, and model.matrix() and
friends are very sophisticated pieces of software (certainly way
beyond me). This whole discussion, of course, raises the (OT!) issue
of the widespread misuse of linear modeling by those with insufficient
background in linear algebra to understand the points S. Ellison
discusses. I won't go there other than to say I have no clue what to
do about it (and I encounter it in my own practice!).

-- Bert

On Fri, Aug 10, 2012 at 9:16 AM, S Ellison <s.elli...@lgcgroup.com> wrote:
>> > R in general tries hard to prohibit this behavior (i.e.,  including an
>> > interaction but not the main effect). When removing a main effect and
>> > leaving the interaction, the number of parameters is not reduced by
>> > one (as would be expected) but stays the same, at least
>> > when using model.matrix:
>
> Surely this behaviour is less to do with a dislike of interactions without 
> both main effects (which we will necessarily use if we fit a simple 
> two-factor nested model) than the need to avoid non-uniqueness of a model 
> fitted with too many coefficients?
> In a simple case, an intercept plus n coefficients for n factor levels gives 
> us n+1 coefficients to find, and we only have n independent groups to 
> estimate them from. In model matrix terms we would have one column that is a 
> linear combination of others. For OLS normal equations that generates a zero 
> determinant and for the numerical methods R uses the effect is the same; no 
> useful fit. To avoid that and allow least squares fitting, R sets up the 
> model matrix with only n-1 coefficients in addition to the intercept. As a 
> result we end up with fewer model coefficients than we might have expected 
> (and that annoyingly missing first level that always puzzles newcomers the 
> first time we look at a linear model summary), but we have exactly the number 
> of coefficients that we can estimate uniquely from the groups we have 
> specified.
>
> S

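The rank argument in the quoted message, and the earlier remark that
dropping a main effect leaves the parameter count unchanged, can both
be checked numerically. This is only a sketch: the factors g, A and B
and the hand-built matrix X_over are invented for illustration.

```r
g <- factor(rep(c("a", "b", "c"), each = 2))  # hypothetical factor, n = 3 levels
n <- nlevels(g)

# Intercept plus one indicator column per level: n + 1 columns,
# but the indicators sum to the intercept column.
X_over <- cbind(1, model.matrix(~ g - 1))
ncol(X_over)            # n + 1
qr(X_over)$rank         # n, so one column is redundant
det(crossprod(X_over))  # (numerically) zero: normal equations are singular

# R's default coding: intercept plus n - 1 treatment-contrast columns.
X_def <- model.matrix(~ g)
ncol(X_def)             # n
qr(X_def)$rank          # n, full column rank

# Dropping the B main effect but keeping A:B leaves the column count
# unchanged; R recodes B within each level of A instead.
d <- expand.grid(A = factor(1:2), B = factor(1:3))
ncol(model.matrix(~ A * B, d))
ncol(model.matrix(~ A + A:B, d))
```
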
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
