Thanks, Bert. It seems I got these NAs because my model already controlled for MSA population in addition to the MSA fixed effects, which made the model overdetermined. The NAs disappeared after I dropped the population variable.
Gary

On Sun, Dec 1, 2013 at 10:27 AM, Bert Gunter <gunter.ber...@gene.com> wrote:

> You may wish to talk to a local statistician or read up on linear
> models, as you appear to not understand some basics. Anyway, either
>
> 1. You have other covariates in your model that you haven't shown and
> your model is overdetermined.
> 2. You have NA's in your data that cause 1) to occur.
>
> As an example of the above:
>
> x <- rep(letters[1:3],e=5)
> y <- factor(rep(1:3,c(5,8,2)))
> summary(lm(rnorm(15)~x+y))
>
> Call:
> lm(formula = rnorm(15) ~ x + y)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -1.6768 -0.3865 -0.1108  0.3090  1.9632
>
> Coefficients: (1 not defined because of singularities)
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  0.04138    0.47160   0.088    0.932
> xb           1.59259    1.17111   1.360    0.201
> xc           0.36822    0.88228   0.417    0.684
> y2          -1.58517    0.96264  -1.647    0.128
> y3                NA         NA      NA       NA
>
> Incidentally, I was surprised to find in R3.0.2 that if some levels of
> a factor are missing either due to NA's in the response or otherwise,
> R estimates the coefficients for the remaining factor levels quite
> nicely. I expected it to complain, but it did not. Maybe it has always
> been so nicely behaved -- I don't fit overdetermined models and take
> care that my factor levels are actually present, so don't run into
> trouble. But if this is newish behavior and you are using an oldish
> version, you might try upgrading to the current version. Or (more
> likely) both clauses of this conditional are false and should be
> ignored, and I should preemptively apologize for my foolishness.
>
> Cheers,
> Bert
>
> On Sun, Dec 1, 2013 at 9:48 AM, Gary Dong <pdxgary...@gmail.com> wrote:
> > Dear R users,
> >
> > I am running a linear regression in R. My observations are Census Tracts in
> > several metropolitan areas (MSAs). In my data set, each MSA has at least 50
> > observations. I use factor(msa_code) in the lm formula to control for
> > metropolitan fixed effects.
> > But I kept getting something like this:
> >
> > .....
> > factor(msa_code)12420  4.910e-01 1.517e-01   3.237 0.001221 **
> > factor(msa_code)12580  1.966e-01 6.861e-02   2.865 0.004194 **
> > factor(msa_code)14460 -3.892e-02 1.653e-02  -2.355 0.018601 *
> > factor(msa_code)16980 -2.873e-01 3.278e-02  -8.764  < 2e-16 ***
> > factor(msa_code)17140  1.088e-01 6.771e-02   1.607 0.108127
> > factor(msa_code)17460 -1.173e-01 4.380e-02  -2.678 0.007441 **
> > factor(msa_code)19100  1.368e-01 5.550e-02   2.465 0.013753 *
> > factor(msa_code)19740  5.819e-01 1.173e-01   4.962 7.33e-07 ***
> > factor(msa_code)19820 -4.214e-01 6.641e-02  -6.346 2.51e-10 ***
> > factor(msa_code)26420  1.258e-01 7.541e-02   1.668 0.095486 .
> > factor(msa_code)28140  2.010e-01 3.847e-02   5.224 1.85e-07 ***
> > factor(msa_code)29820  7.102e-02 6.593e-02   1.077 0.281435
> > factor(msa_code)31100 -4.832e-01 1.088e-01  -4.440 9.28e-06 ***
> > factor(msa_code)33100 -2.534e-01 6.391e-02  -3.965 7.49e-05 ***
> > factor(msa_code)33460  5.229e-02 7.891e-02   0.663 0.507609
> > factor(msa_code)35620 -3.197e-01 7.565e-02  -4.225 2.45e-05 ***
> > factor(msa_code)36740  1.269e-01 6.948e-02   1.826 0.067868 .
> > factor(msa_code)37980  1.394e-01 4.388e-02   3.178 0.001497 **
> > factor(msa_code)38060 -6.935e-02 6.124e-02  -1.132 0.257540
> > factor(msa_code)38300  1.647e-01 3.986e-02   4.133 3.67e-05 ***
> > factor(msa_code)38900  2.605e-01 1.420e-01   1.835 0.066664 .
> > factor(msa_code)39300 -9.612e-02 4.704e-02  -2.043 0.041103 *
> > factor(msa_code)40140 -2.353e-01 3.562e-02  -6.605 4.59e-11 ***
> > factor(msa_code)40900         NA        NA      NA       NA
> > factor(msa_code)41740         NA        NA      NA       NA
> > factor(msa_code)41860         NA        NA      NA       NA
> > factor(msa_code)42660         NA        NA      NA       NA
> > factor(msa_code)45300         NA        NA      NA       NA
> > factor(msa_code)47900         NA        NA      NA       NA
> >
> > I wonder why I kept getting those "NAs". Thank you!
> >
> > Gary
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> (650) 467-7374
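
The resolution reported at the top of this thread (NAs that vanish once MSA population is dropped) can be reproduced on a small scale. The following is only a sketch on simulated data, not the original poster's model: the data frame `d`, the columns `tract_x` and `msa_pop`, and all numeric values are made up for illustration. It relies on the fact that a covariate which takes a single value per MSA is an exact linear combination of the intercept and the MSA dummies, so lm() cannot estimate all of them and marks one coefficient as NA.

```r
# Hypothetical simulated data: 3 MSAs with 5 tracts each.
# Variable names and values are illustrative assumptions.
set.seed(1)
msa_levels <- c(12420, 12580, 14460)
d <- data.frame(
  msa_code = rep(msa_levels, each = 5),
  tract_x  = rnorm(15)                 # tract-level covariate, varies within MSA
)

# MSA-level covariate: one value per MSA, repeated for every tract in it.
d$msa_pop <- c(1.7, 0.3, 4.5)[match(d$msa_code, msa_levels)]
d$y <- 1 + 0.5 * d$tract_x + rnorm(15)

# Fixed effects plus the MSA-level covariate: the design matrix is rank
# deficient, so one coefficient comes back as NA.
fit_full <- lm(y ~ tract_x + msa_pop + factor(msa_code), data = d)
print(coef(fit_full))

# alias() reports which term is an exact linear combination of the others.
print(alias(fit_full)$Complete)

# Dropping the MSA-level covariate gives a full-rank model.
fit_fe <- lm(y ~ tract_x + factor(msa_code), data = d)
print(coef(fit_fe))

# Both models span the same column space, so the fitted values agree.
print(all.equal(fitted(fit_full), fitted(fit_fe)))
```

In this sketch a single MSA-level covariate aliases exactly one coefficient. The thread does not show the full model behind the six NAs in the original output, so the sketch illustrates the mechanism only and does not reconstruct that fit.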