I have a question about stepAIC and extractAIC and why they can produce different answers.
Here's a stepAIC result (slightly edited: I removed the warning about non-integer #successes):

stepAIC(glm(formula = (Morbid_70_79/Present_70_79) ~ 1 + Cohort + Cohort2,
    family = binomial, data = ghs_70_79, subset = ghs_70_full),
    direction = c("backward"))
Start:  AIC=3151.41
(Morbid_70_79/Present_70_79) ~ 1 + Cohort + Cohort2

          Df Deviance    AIC
<none>         1797.6 3151.4
- Cohort   1   1826.2 3178.0
- Cohort2  1   1826.3 3178.2

Call:  glm(formula = (Morbid_70_79/Present_70_79) ~ 1 + Cohort + Cohort2,
    family = binomial, data = ghs_70_79, subset = ghs_70_full)

Coefficients:
(Intercept)       Cohort      Cohort2
   -0.54094      0.35295     -0.01659

Degrees of Freedom: 2722 Total (i.e. Null);  2720 Residual
  (2015 observations deleted due to missingness)
Null Deviance:     1826
Residual Deviance: 1798    AIC: 3151

Based upon the above, note that the following models have these AIC scores:

1 + Cohort + Cohort2    3151.4
1 + Cohort2             3178.0
1 + Cohort              3178.2

Now consider the direct calculation of AIC:

> logLik(glm(formula = (Morbid_70_79/Present_70_79) ~ 1 + Cohort + Cohort2,
+     family = binomial, data = ghs_70_79, subset = ghs_70_full))
'log Lik.' -1572.703 (df=3)
> -2*-1572.703 + 6
[1] 3151.406

This matches the stepAIC result.

> logLik(glm(formula = (Morbid_70_79/Present_70_79) ~ 1 + Cohort2,
+     family = binomial, data = ghs_70_79, subset = ghs_70_full))
'log Lik.' -1599.126 (df=2)
> -2*-1599.126 + 4
[1] 3202.252

This does not match the stepAIC result (= 3178.0).

> logLik(glm(formula = (Morbid_70_79/Present_70_79) ~ 1 + Cohort,
+     family = binomial, data = ghs_70_79, subset = ghs_70_full))
'log Lik.' -1599.264 (df=2)
> -2*-1599.264 + 4
[1] 3202.528

This does not match the stepAIC result (= 3178.2).

As you know, stepAIC uses extractAIC, e.g.

> extractAIC(glm(formula = (Morbid_70_79/Present_70_79) ~ 1 + Cohort,
+     family = binomial, data = ghs_70_79, subset = ghs_70_full))
[1]    2.000 3202.527

Why are the AIC results from stepAIC different from those calculated directly? Of course, AIC is only calculated up to an arbitrary constant.
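To put the three comparisons side by side, here is a minimal sketch that uses only the numbers quoted above. Nothing is refitted, and the vector names (full, cohort2_only, cohort_only) are just labels made up for illustration; the direct AIC is taken to be -2*logLik + 2*df, as in the hand calculations.

```r
# Values copied from the transcript above; no model is refitted here.
# The names are illustrative labels, not objects from the original session.
loglik   <- c(full = -1572.703, cohort2_only = -1599.126, cohort_only = -1599.264)
n_par    <- c(full = 3,         cohort2_only = 2,         cohort_only = 2)
step_aic <- c(full = 3151.4,    cohort2_only = 3178.0,    cohort_only = 3178.2)

# Direct AIC from the log-likelihood: -2 * logLik + 2 * (number of parameters)
direct_aic <- -2 * loglik + 2 * n_par

# Side-by-side table with the gap between the two calculations
comparison <- data.frame(direct_aic, step_aic, gap = direct_aic - step_aic)
print(round(comparison, 3))
```

With these numbers the gap is negligible for the full model (rounding only) and roughly 24.3 for each of the two reduced models.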
So, the issue is that some of the AIC values match and some don't.

Many thanks!

-- 
Steven Orzack
The Fresh Pond Research Institute
173 Harvey Street
Cambridge, MA. 02140
617 864-4307
www.freshpond.org