Sorry, I made a stupid mistake. I got it. Thanks a lot!

Peter Dalgaard wrote:
>
> rlearner309 wrote:
>> I think it is zero, because you have lots of zeros there. It is not like
>> continuous variables.
>>
> Think again. The sum of products may be zero, but that is not the
> covariance. And don't dismiss Thomas, he is usually right.
>
> Anyway, the coefficients of dummy variables represent differences from
> the same base level, and choosing a poorly determined base level
> (essentially, one whose mean is determined by only a few observations)
> will cause high parameter correlation. It should only affect those
> parameters, though, and it is not really clear what VIF means for dummy
> variables. One often chooses to relevel() to make the largest group the
> base level, but it really comes down to which group contrasts you want
> to look at.
>
>>
>> Thomas Lumley wrote:
>>
>>> On Wed, 2 Jul 2008, rlearner309 wrote:
>>>
>>>> I think the covariance between dummy variables, or between dummy
>>>> variables and the intercept, should always be zero. Meaning: no
>>>> singularity problem?
>>>>
>>> No. You can easily check that this is not true using the cov()
>>> function. Indicator variables for mutually exclusive groups are
>>> negatively correlated.
>>>
>>>      -thomas
>>>
>>>> rlearner309 wrote:
>>>>
>>>>> This is actually more of a statistics problem:
>>>>> I have a dataset with two dummy variables coding three levels. The
>>>>> problem is that one level has very few observations compared with
>>>>> the other two (a couple of data points compared with 1000+ points
>>>>> on the other levels). When I run the regression, the result is bad:
>>>>> I get unbalanced SE and VIF. Does this kind of problem also count
>>>>> as a "near singularity" problem? Does it make any difference if I
>>>>> code the level that lacks data as (0,0) instead of (0,1)?
>>>>>
>>>>> thanks a lot!
>>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/A-regression-problem-using-dummy-variables-tp18214377p18237666.html
>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>> Thomas Lumley                   Assoc. Professor, Biostatistics
>>> [EMAIL PROTECTED]               University of Washington, Seattle
>>>
>
> --
>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
> ~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907
>
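For readers of the archive, here is a minimal sketch of the two points made in the quoted replies: that cov() shows indicator variables for mutually exclusive groups are negatively correlated even though their sum of products is zero, and that taking a sparsely observed group as the base level makes the dummy coefficients highly correlated. The data are simulated, and the group sizes and all variable names (g, y, d_b, d_c, coef_cor) are assumptions made for illustration; they are not the original poster's data.

```r
# Hypothetical data: three groups, one of them with only 3 observations.
set.seed(1)
g <- factor(rep(c("a", "b", "c"), times = c(1000, 1000, 3)))
y <- rnorm(length(g))  # response unrelated to group, for illustration only

# Indicator (dummy) variables for groups "b" and "c".
d_b <- as.numeric(g == "b")
d_c <- as.numeric(g == "c")

# The sum of products is zero because the groups are mutually exclusive ...
sum_prod <- sum(d_b * d_c)
# ... but the covariance also subtracts the product of the means,
# so it comes out negative rather than zero.
cov_bc <- cov(d_b, d_c)

# Correlation between the two dummy coefficient estimates for a given coding.
coef_cor <- function(f) {
  fit <- lm(y ~ f)
  cov2cor(vcov(fit))[2, 3]  # rows/cols 2 and 3 are the two dummy coefficients
}

# Base level = the tiny group "c" versus base level = the large group "a".
cor_small_base <- coef_cor(relevel(g, ref = "c"))
cor_large_base <- coef_cor(relevel(g, ref = "a"))

print(c(sum_prod = sum_prod, cov_bc = cov_bc))
print(c(small_base = cor_small_base, large_base = cor_large_base))
```

With the tiny group as the base level, both coefficients are differences from the same poorly determined mean, so their estimates move together; with a large group as the base, only the coefficient for the tiny group is imprecise. Which coding to prefer still depends on which group contrasts are of interest.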
--
View this message in context:
http://www.nabble.com/A-regression-problem-using-dummy-variables-tp18214377p18260470.html
Sent from the R help mailing list archive at Nabble.com.