Thanks, Bert. It seems I got these NAs because my model already controlled
for MSA population in addition to the MSA fixed effects. An MSA-level
variable is collinear with the fixed-effect dummies, so the model was
overdetermined. The NAs disappeared after I dropped the population variable.
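
For anyone who finds this thread later, here is a minimal sketch of what I
think happened. It is toy data, not my real model: the MSA codes, the
population values, and the names d, x, msa_pop and y are all made up for
illustration.

```r
# Hypothetical toy data: 3 MSAs with 50 tracts each (all values are made up).
set.seed(1)
d <- data.frame(
  msa_code = rep(c("12420", "12580", "14460"), each = 50),
  x        = rnorm(150)                       # a tract-level covariate
)

# MSA-level population: one value per MSA, repeated for every tract in it.
d$msa_pop <- c(1.7, 0.3, 4.6)[as.integer(factor(d$msa_code))]
d$y <- 1 + 0.5 * d$x + rnorm(150)

# msa_pop is constant within each MSA, so it is a linear combination of the
# intercept and the MSA dummies; lm() reports one coefficient as NA.
fit <- lm(y ~ x + msa_pop + factor(msa_code), data = d)
print(coef(fit))

# alias() reports which terms are linearly dependent on the others.
print(alias(fit))

# Dropping the MSA-level variable removes the NA.
fit2 <- update(fit, . ~ . - msa_pop)
print(coef(fit2))
```

In this sketch it is the last factor level that comes back as NA, because
lm() keeps the columns it meets first and drops the later, redundant one.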

Gary


On Sun, Dec 1, 2013 at 10:27 AM, Bert Gunter <gunter.ber...@gene.com> wrote:

> You may wish to talk to a local statistician or read up on linear
> models, as you appear to not understand some basics. Anyway, either
>
> 1. You have other covariates in your model that you haven't shown and
> your model is overdetermined, or
> 2. You have NAs in your data that cause 1) to occur.
>
> As an example of the above:
>
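> # y is 1 exactly when x is "a", so the dummies for x and y are
> # linearly dependent and one coefficient cannot be estimated
> x <- rep(letters[1:3], each = 5)
> y <- factor(rep(1:3, c(5, 8, 2)))
> summary(lm(rnorm(15) ~ x + y))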
>
> Call:
> lm(formula = rnorm(15) ~ x + y)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -1.6768 -0.3865 -0.1108  0.3090  1.9632
>
> Coefficients: (1 not defined because of singularities)
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  0.04138    0.47160   0.088    0.932
> xb           1.59259    1.17111   1.360    0.201
> xc           0.36822    0.88228   0.417    0.684
> y2          -1.58517    0.96264  -1.647    0.128
> y3                NA         NA      NA       NA
>
>
> Incidentally, I was surprised to find in R 3.0.2 that if some levels of
> a factor are missing either due to NA's in the response or otherwise,
> R estimates the coefficients for the remaining factor levels quite
> nicely. I expected it to complain, but it did not. Maybe it has always
> been so nicely behaved -- I don't fit overdetermined models and take
> care that my factor levels are actually present, so don't run into
> trouble. But if this is newish behavior and you are using an oldish
> version, you might try upgrading to the current version. Or (more
> likely) both clauses of this conditional are false and should be
> ignored, and I should preemptively apologize for my foolishness.
>
> Cheers,
> Bert
>
> On Sun, Dec 1, 2013 at 9:48 AM, Gary Dong <pdxgary...@gmail.com> wrote:
> > Dear R users,
> >
> > I am running a linear regression in R. My observations are Census Tracts in
> > several metropolitan areas (MSAs). In my data set, each MSA has at least 50
> > observations. I use factor(msa_code) in the lm formula to control for
> > metropolitan fixed effects. But I kept getting something like this:
> >
> > .....
> > factor(msa_code)12420  4.910e-01  1.517e-01   3.237 0.001221 **
> > factor(msa_code)12580  1.966e-01  6.861e-02   2.865 0.004194 **
> > factor(msa_code)14460 -3.892e-02  1.653e-02  -2.355 0.018601 *
> > factor(msa_code)16980 -2.873e-01  3.278e-02  -8.764  < 2e-16 ***
> > factor(msa_code)17140  1.088e-01  6.771e-02   1.607 0.108127
> > factor(msa_code)17460 -1.173e-01  4.380e-02  -2.678 0.007441 **
> > factor(msa_code)19100  1.368e-01  5.550e-02   2.465 0.013753 *
> > factor(msa_code)19740  5.819e-01  1.173e-01   4.962 7.33e-07 ***
> > factor(msa_code)19820 -4.214e-01  6.641e-02  -6.346 2.51e-10 ***
> > factor(msa_code)26420  1.258e-01  7.541e-02   1.668 0.095486 .
> > factor(msa_code)28140  2.010e-01  3.847e-02   5.224 1.85e-07 ***
> > factor(msa_code)29820  7.102e-02  6.593e-02   1.077 0.281435
> > factor(msa_code)31100 -4.832e-01  1.088e-01  -4.440 9.28e-06 ***
> > factor(msa_code)33100 -2.534e-01  6.391e-02  -3.965 7.49e-05 ***
> > factor(msa_code)33460  5.229e-02  7.891e-02   0.663 0.507609
> > factor(msa_code)35620 -3.197e-01  7.565e-02  -4.225 2.45e-05 ***
> > factor(msa_code)36740  1.269e-01  6.948e-02   1.826 0.067868 .
> > factor(msa_code)37980  1.394e-01  4.388e-02   3.178 0.001497 **
> > factor(msa_code)38060 -6.935e-02  6.124e-02  -1.132 0.257540
> > factor(msa_code)38300  1.647e-01  3.986e-02   4.133 3.67e-05 ***
> > factor(msa_code)38900  2.605e-01  1.420e-01   1.835 0.066664 .
> > factor(msa_code)39300 -9.612e-02  4.704e-02  -2.043 0.041103 *
> > factor(msa_code)40140 -2.353e-01  3.562e-02  -6.605 4.59e-11 ***
> > factor(msa_code)40900         NA         NA      NA       NA
> > factor(msa_code)41740         NA         NA      NA       NA
> > factor(msa_code)41860         NA         NA      NA       NA
> > factor(msa_code)42660         NA         NA      NA       NA
> > factor(msa_code)45300         NA         NA      NA       NA
> > factor(msa_code)47900         NA         NA      NA       NA
> >
> > I wonder why I keep getting those NAs. Thank you!
> >
> > Gary
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> (650) 467-7374
>
