Christopher David Desjardins <desja004 <at> umn.edu> writes: > > Hi, > I am running a Cox Mixed Effects Hazard model using the library coxme. I > am trying to model time to onset (age_sym1) of thought problems (e.g. > hearing voices) (sym1). As I have siblings in my dataset, I have > decided to account for this by including a random effect for family > (famid). My covariate of interest is Mother's diagnosis where a 0 is > bipolar, 1 is control, and 2 is major depression. I am trying to fit > the following model. > > thorp1 <- coxme(Surv(age_sym1, sym1) ~ lifedxm + (1 | famid), data = > bip.surv) > > Which provides the following output: > > ------------------------------------------------- > > thorp1 > Cox mixed-effects model fit by maximum likelihood > Data: bip.surv > events, n = 99, 189 (3 observations deleted due to missingness) > Iterations= 10 54 > NULL Integrated Penalized > Log-likelihood -479.0372 -467.3549 -435.2096 > Chisq df p AIC BIC > Integrated loglik 23.36 3.00 3.3897e-05 17.36 9.58 > Penalized loglik 87.66 47.27 3.2374e-04 -6.88 -129.54 > Model: Surv(age_sym1, sym1) ~ lifedxm + (1 | famid) > Fixed coefficients > coef exp(coef) se(coef) z p > lifedxmCONTROL -1.0576257 0.3472794 0.3872527 -2.73 0.0063 > lifedxmMAJOR -0.6329957 0.5309987 0.3460847 -1.83 0.0670 > Random effects > Group Variable Std Dev Variance > famid Intercept 0.9877770 0.9757033 > > -------------------------------------------------- > > So it appears that there is a difference in hazard rates associated with > Mother's diagnosis between BIPOLAR and CONTROL. Just to be safe, I fit a > model without this covariate present and decided to compare the models > with AIC and see if fit decreased with this covariate not in the model. > > ---------------------------------------------------------- > thorp1.n <- coxme(Surv(age_sym1, sym1) ~ (1 | famid), data = bip.surv) > > thorp1.n > Cox mixed-effects model fit by maximum likelihood > Data: bip.surv > events, n = 99, 189 (3 observations deleted due to missingness) > Iterations= 10 54 > NULL Integrated Penalized > Log-likelihood -479.0372 -471.3333 -436.0478 > Chisq df p AIC BIC > Integrated loglik 15.41 1.00 0.00008663 13.41 10.81 > Penalized loglik 85.98 49.48 0.00099915 -12.97 -141.37 > Model: Surv(age_sym1, sym1) ~ (1 | famid) > Random effects > Group Variable Std Dev Variance > famid Intercept 1.050949 1.104495 > ------------------------------------------------------------ > > The problem is that the AIC for the model w/o lifedxm is 13.41 and the > model w/ lifedxm is 17.36. So fit actually improved without lifedxm in > the model but really the fit is indistinguishable if I use Burnham & > Anderson, 2002's criteria. > > So my conundrum is this - The AIC and the p-values appear to disagree. > The p-value would suggest that lifedxm should be included in the model > and the AIC says it doesn't improve fit. In general, I tend to favor the > AIC (I usually work with large sample sizes and multilevel models) but I > am just beginning with survival models and I am curious if a survival > model guru might suggest whether lifedxm needs to be in the model or not > based on my results here? Would people generally use the AIC in this > situation? Also, I can't run anova() on this models because of the > random effect. > > I am happy to provide the data if necessary. > > Please cc me on a reply. > > Thanks, > Chris > >
Hi Chris, Did you get an answer to why the p-value and AIC score disagreed? I am getting the same results with my own data. They're not small disagreements either. The AIC score is telling me the opposite of what the p-value and the parameter coef are saying. The p-value and the coef for the predictor variable are in agreement. I've also noticed in some published papers with tables containing p-values and AIC scores that the chisq p-value and AIC score are in opposition too. This makes me think I'm missing something and that this all actually makes sense. However given that AIC = − 2 (log L) + 2K (where K is the number of parameters) and seeing as how the log-likelihood scores below are in the hundreds, shouldn't the AIC score be in the hundreds as well?? -------------------------------------- model0 (intercept only model) NULL Integrated Penalized Log-likelihood -119.8470 -117.9749 -113.1215 Chisq df p AIC BIC Integrated loglik 3.74 1.00 0.052989 1.74 0.08 Penalized loglik 13.45 7.06 0.063794 -0.68 -12.43 Random effects Group Variable Std Dev Variance Site Intercept 0.6399200 0.4094977 _____ model1 (before vs after) NULL Integrated Penalized Log-likelihood -119.8470 -112.1598 -108.1663 Chisq df p AIC BIC Integrated loglik 15.37 2.00 0.00045863 11.37 8.05 Penalized loglik 23.36 7.06 0.00153710 9.25 -2.49 Fixed coefficients Coef exp(coef) se(coef) z p Time -1.329997 0.2644779 0.4009229 -3.32 0.00091 Random effects Group Variable Std Dev Variance Site Intercept 0.5625206 0.3164294 ______ model2 (stim1 vs stim2) NULL Integrated Penalized Log-likelihood -119.8470 -117.8677 -113.167 Chisq df p AIC BIC Integrated loglik 3.96 2.00 0.138160 -0.04 -3.37 Penalized loglik 13.36 7.83 0.093314 -2.31 -15.34 Fixed coefficients coef exp(coef) se(coef) z p stim 0.1621367 1.176021 0.3534788 0.46 0.65 Random effects Group Variable Std Dev Variance Site Intercept 0.6262600 0.3922016 ---------------------------------------------- If you didn't get an answer, hopefully, someone that understands all this will reply for both of us. Thanks, -Teresa ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.