I'm sorry -- I think I chose a bad example. Let me start over again: I want to estimate a moderated regression model of the following form: y = a*x1 + b*x2 + c*x1*x2 + e
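For concreteness, a model of this form is fit in R with lm()'s formula interface; below is a minimal sketch using simulated data, with all variable names and coefficient values made up purely for illustration:

```r
# Minimal sketch, assuming simulated data: the formula y ~ x1 * x2
# expands to x1 + x2 + x1:x2, i.e. both main effects plus the interaction.
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.5 * x1 - 0.3 * x2 + 0.8 * x1 * x2 + rnorm(n)  # true c = 0.8
fit <- lm(y ~ x1 * x2)
coef(fit)  # estimates for (Intercept), x1, x2 and the x1:x2 interaction
```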
Based on my understanding, including an interaction term (x1*x2) in the regression in addition to x1 and x2 leads to issues of multicollinearity, as x1*x2 is likely to covary to some degree with x1 (and x2). One recommendation I have seen in this context is to use mean centering, but apparently this does not solve the problem (see: Echambadi, Raj and James D. Hess (2007), "Mean-centering does not alleviate collinearity problems in moderated multiple regression models," Marketing Science, 26 (3), 438-45).

So my question is: which R function can I use to estimate this type of model?

Sorry for the confusion caused by my previous message,
Michael

On Aug 3, 2010 3:42pm, David Winsemius <dwinsem...@comcast.net> wrote:
> I think you are attributing to "collinearity" a problem that is due to
> your small sample size. You are predicting 9 points with 3 predictor
> terms, and incorrectly concluding that there is some "inconsistency"
> because you get an R^2 that is above some number you deem surprising.
> (I got values between 0.2 and 0.4 on several runs.)
>
> Try:
> x1
> x2
> x3
> y
> model
> summary(model)
> # Multiple R-squared: 0.04269
>
> --
> David.
>
> On Aug 3, 2010, at 9:10 AM, Michael Haenlein wrote:
>
> Dear all,
>
> I have one dependent variable y and two independent variables x1 and x2
> which I would like to use to explain y. x1 and x2 are design factors in
> an experiment and are not correlated with each other. For example,
> assume that:
>
> x1
> x2
> cor(x1,x2)
>
> The problem is that I do not only want to analyze the effect of x1 and
> x2 on y but also of their interaction x1*x2. Evidently this interaction
> term has a substantial correlation with both x1 and x2:
>
> x3
> cor(x1,x3)
> cor(x2,x3)
>
> I therefore expect that a simple regression of y on x1, x2 and x1*x2
> will lead to biased results due to multicollinearity.
> For example, even when y is completely random and unrelated to x1 and
> x2, I obtain a substantial R^2 for a simple linear model which includes
> all three variables. This evidently does not make sense:
>
> y
> model
> summary(model)
>
> Is there some function within R or in some separate library that allows
> me to estimate such a regression without obtaining inconsistent results?
>
> Thanks for your help in advance,
> Michael
>
> Michael Haenlein
> Associate Professor of Marketing
> ESCP Europe
> Paris, France
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
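David's small-sample point can be reproduced with a short simulation. The design values below are made up for illustration (a balanced 3 x 3 layout, not the code from the thread): with only 9 observations and three predictor terms, even a y that is pure noise yields a sizeable in-sample R^2 on average.

```r
# Illustrative simulation (values assumed, not from the original thread):
# with n = 9 points and three predictor terms, random y still "explains"
# a lot in-sample -- the high R^2 reflects overfitting, not collinearity.
set.seed(1)
x1 <- rep(c(-1, 0, 1), each = 3)   # hypothetical design factor 1
x2 <- rep(c(-1, 0, 1), times = 3)  # hypothetical design factor 2
r2 <- replicate(1000, {
  y <- rnorm(9)                    # y completely unrelated to x1, x2
  summary(lm(y ~ x1 * x2))$r.squared
})
range(r2)  # individual fits vary widely, as David observed
mean(r2)   # averages near 3/8, since E[R^2] = p/(n-1) under the null
```

Individual draws scatter widely (consistent with the 0.2-0.4 values David reports), which is why no re-parameterization of the interaction term will make the R^2 of a 9-point fit behave.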