Hello,

I am looking at the job satisfaction data below, from a problem in Agresti's book, and I am not sure where the degrees of freedom come from. Given the way I am fitting the binomial model, I have 168 observations, so as I understand it that should also be the number of fitted parameters in the saturated model. Since the null model has a single intercept parameter, I expected 167 df for it, but R reports 165. Where does this number come from?

Thanks in advance,
Giovanni

> ### Agresti, Problem 5.23
> race <- c("White", "Other")
> gender <- c("M", "F")
> age <- c("<35", "35-44", ">44")
> loc <- c("NE", "MidAtl", "S", "MidW", "NW", "SW", "Pac")
> sat <- factor(c("Yes", "No"), levels = c("No", "Yes"))
> Freq <- c(288, 60, 224, 35, 337, 70, 38, 19, 32, 22, 21, 15,
+           177, 57, 166, 19, 172, 30, 33, 35, 11, 20, 8, 10,
+           90, 19, 96, 12, 124, 17, 18, 13, 7, 0, 9, 1,
+           45, 12, 42, 5, 39, 2, 6, 7, 2, 3, 2, 1,
+           226, 88, 189, 44, 156, 70, 45, 47, 18, 13, 11, 9,
+           128, 57, 117, 34, 73, 25, 31, 35, 3, 7, 2, 2,
+           285, 110, 225, 53, 324, 60, 40, 66, 19, 25, 22, 11,
+           179, 93, 141, 24, 140, 47, 25, 56, 11, 19, 2, 12,
+           270, 176, 215, 80, 269, 110, 36, 25, 9, 11, 16, 4,
+           180, 151, 108, 40, 136, 40, 20, 16, 7, 5, 3, 5,
+           252, 97, 162, 47, 199, 62, 69, 45, 14, 8, 14, 2,
+           126, 61, 72, 27, 93, 24, 27, 36, 7, 4, 5, 0,
+           119, 62, 66, 20, 67, 25, 45, 22, 15, 10, 8, 6,
+           58, 33, 20, 10, 21, 10, 16, 15, 10, 8, 6, 2)
> satdata <- data.frame(Freq, expand.grid(gender=gender, age=age,
+                       race=race, sat=sat, loc=loc))
> sat.glm0 <- glm(sat ~ gender + age + race + loc, weights = Freq,
+                 family = binomial, data = satdata)
> summary(sat.glm0)

Call:
glm(formula = sat ~ gender + age + race + loc, family = binomial,
    data = satdata, weights = Freq)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-19.456   -6.839    0.000    6.309   17.635

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.334265   0.056491   5.917 3.28e-09 ***
genderF     -0.180480   0.047575  -3.794 0.000149 ***
age35-44     0.122422   0.051836   2.362 0.018191 *
age>44       0.361610   0.051576   7.011 2.36e-12 ***
raceOther   -0.005883   0.061605  -0.095 0.923919
locMidAtl    0.437342   0.103821   4.212 2.53e-05 ***
locS         0.178574   0.073033   2.445 0.014481 *
locMidW      0.083189   0.066427   1.252 0.210449
locNW        0.134337   0.067498   1.990 0.046563 *
locSW        0.295874   0.073488   4.026 5.67e-05 ***
locPac       0.425480   0.096561   4.406 1.05e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 12987  on 165  degrees of freedom
Residual deviance: 12880  on 155  degrees of freedom
AIC: 12902

Number of Fisher Scoring iterations: 4

> str(satdata)
'data.frame':   168 obs. of  6 variables:
 $ Freq  : num  288 60 224 35 337 70 38 19 32 22 ...
 $ gender: Factor w/ 2 levels "M","F": 1 2 1 2 1 2 1 2 1 2 ...
 $ age   : Factor w/ 3 levels "<35","35-44",..: 1 1 2 2 3 3 1 1 2 2 ...
 $ race  : Factor w/ 2 levels "White","Other": 1 1 1 1 1 1 2 2 2 2 ...
 $ sat   : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
 $ loc   : Factor w/ 7 levels "NE","MidAtl",..: 1 1 1 1 1 1 1 1 1 1 ...
> sessionInfo()
R version 2.6.2 (2008-02-08)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.6.2

-- 
Giovanni Petris <[EMAIL PROTECTED]>
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/
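
One way to probe the df arithmetic in the question is to check how glm() counts observations when some prior weights are zero. The sketch below is only an illustration on made-up data: the data frame `toy` and the names `n_rows`, `n_used`, and `n_coef` are hypothetical and not part of the post. It works from the assumption that rows with zero weight are left out of the observation count used for the degrees of freedom, and compares the reported values with counts based on the nonzero-weight rows.

```r
# Toy data (hypothetical, not the Agresti table): six rows, one with weight 0.
toy <- data.frame(
  y = factor(c("No", "Yes", "No", "Yes", "No", "Yes"), levels = c("No", "Yes")),
  x = factor(c("a", "a", "b", "b", "a", "b")),
  w = c(10, 20, 15, 5, 0, 12)
)

# Same style of fit as in the post: factor response, counts as prior weights.
fit <- glm(y ~ x, weights = w, family = binomial, data = toy)

n_rows <- nrow(toy)          # rows in the data frame
n_used <- sum(toy$w > 0)     # rows that carry a nonzero weight
n_coef <- length(coef(fit))  # intercept plus one dummy for x

# Put the reported df next to the row counts for comparison.
c(rows = n_rows, used = n_used,
  df.null = fit$df.null, df.residual = fit$df.residual)
```

If the same accounting applies to the posted data, the check would be `sum(satdata$Freq == 0)`. The Freq vector as shown contains two zeros, which would give 168 - 2 = 166 counted observations, and then 166 - 1 = 165 for the null model and 166 - 11 = 155 for the fitted model, matching the summary output.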