2008/7/16 Dimitris Rizopoulos <[EMAIL PROTECTED]>: > well, for computing the p-value you need to use pchisq() and dchisq() (check > ?dchisq for more info). For model fits with a logLik method you can directly > use the following simple function: > > lrt <- function (obj1, obj2) { > L0 <- logLik(obj1) > L1 <- logLik(obj2) > L01 <- as.vector(- 2 * (L0 - L1)) > df <- attr(L1, "df") - attr(L0, "df") > list(L01 = L01, df = df, > "p-value" = pchisq(L01, df, lower.tail = FALSE)) > } > > library(lme4) > gm0 <- glm(cbind(incidence, size - incidence) ~ period, > family = binomial, data = cbpp) > gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd), > family = binomial, data = cbpp) > > lrt(gm0, gm1)
Yes, that seems quite natural, but then try to compare with the deviance: logLik(gm0) logLik(gm1) (d0 <- deviance(gm0)) (d1 <- deviance(gm1)) (LR <- d0 - d1) pchisq(LR, 1, lower = FALSE) Obviously the deviance in glm is *not* twice the negative log-likelihood as it is in glmer. The question remains which of these two quantities is appropriate for comparison. I am not sure exactly how the deviance and/or log-likelihood are calculated in glmer, but my feeling is that one should trust the deviance rather than the log-likelihoods for these purposes. This is supported by the following comparison: Ad an arbitrary random effect with a close-to-zero variance and note the deviance: tmp <- rep(1:4, each = nrow(cbpp)/4) gm2 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | tmp), family = binomial, data = cbpp) (d2 <- deviance(gm2)) This deviance is very close to that obtained from the glm model. I have included the mixed-models mailing list in the hope that someone could explain how the deviance is computed in glmer and why deviances, but not likelihoods are comparable to glm-fits. Best Rune > > > However, there are some issues regarding this likelihood ratio test. > > 1) The null hypothesis is on the boundary of the parameter space, i.e., you > test whether the variance for the random effect is zero. For this case the > assumed chi-squared distribution for the LRT may *not* be totally > appropriate and may produce conservative p-values. There is some theory > regarding this issue, which has shown that the reference distribution for > the LRT in this case is a mixture of a chi-squared(df = 0) and > chi-squared(df = 1). Another option is to use simulation-based approach > where you can approximate the reference distribution of the LRT under the > null using simulation. You may check below for an illustration of this > procedure (not-tested): > > X <- model.matrix(gm0) > coefs <- coef(gm0) > pr <- plogis(c(X %*% coefs)) > n <- length(pr) > new.dat <- cbpp > Tobs <- lrt(gm0, gm1)$L01 > B <- 200 > out.T <- numeric(B) > for (b in 1:B) { > y <- rbinom(n, cbpp$size, pr) > new.dat$incidence <- y > fit0 <- glm(formula(gm0), family = binomial, data = new.dat) > fit1 <- glmer(formula(gm1), family = binomial, data = new.dat) > out.T[b] <- lrt(fit0, fit1)$L01 > } > # estimate p-value > (sum(out.T >= Tobs) + 1) / (B + 1) > > > 2) For the glmer fit you have to note that you work with an approximation to > the log-likelihood (obtained using numerical integration) and not the actual > log-likelihood. > > I hope it helps. > > Best, > Dimitris > > ---- > Dimitris Rizopoulos > Biostatistical Centre > School of Public Health > Catholic University of Leuven > > Address: Kapucijnenvoer 35, Leuven, Belgium > Tel: +32/(0)16/336899 > Fax: +32/(0)16/337015 > Web: http://med.kuleuven.be/biostat/ > http://www.student.kuleuven.be/~m0390867/dimitris.htm > > > Quoting COREY SPARKS <[EMAIL PROTECTED]>: > >> Dear list, >> I am fitting a logistic multi-level regression model and need to test the >> difference between the ordinary logistic regression from a glm() fit and >> the mixed effects fit from glmer(), basically I want to do a likelihood >> ratio test between the two fits. >> >> >> The data are like this: >> My outcome is a (1,0) for health status, I have several (1,0) dummy >> variables RURAL, SMOKE, DRINK, EMPLOYED, highereduc, INDIG, male, >> divorced, SINGLE, chronic, vigor_d and moderat_d and AGE is continuous (20 >> to 100). >> My higher level is called munid and has 581 levels. >> The data have 45243 observations. >> >> Here are my program statements: >> >> #GLM fit >> >> ph.fit.2<-glm(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d,family=binomial(), >> data=mx.merge) >> #GLMER fit >> >> ph.fit.3<-glmer(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+INSURANCE+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d+(1|munid),family=binomial(), >> data=mx.merge) >> >> I cannot find a method in R that will do the LR test between a glm and a >> glmer fit, so I try to do it using the liklihoods from both models >> >> #form the likelihood ratio test between the glm and glmer fits >> x2<--2*(logLik(ph.fit.2)-logLik(ph.fit.3)) >> >>> ML >> >> 79.60454 >> attr(,"nobs") >> n >> 45243 >> attr(,"nall") >> n >> 45243 >> attr(,"df") >> [1] 14 >> attr(,"REML") >> [1] FALSE >> attr(,"class") >> [1] "logLik" >> >> #Get the associated p-value >> dchisq(x2,14) >> ML >>> >>> 5.94849e-15 >> >> Which looks like an improvement in model fit to me. Am I seeing this >> correctly or are the two models even able to be compared? they are both >> estimated via maximum likelihood, so they should be, I think. >> Any help would be appreciated. >> >> Corey >> >> Corey S. Sparks, Ph.D. >> >> Assistant Professor >> Department of Demography and Organization Studies >> University of Texas San Antonio >> One UTSA Circle >> San Antonio, TX 78249 >> email:[EMAIL PROTECTED] >> web: https://rowdyspace.utsa.edu/users/ozd504/www/index.htm >> >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> > > > > Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Rune Haubo Bojesen Christensen Master Student, M.Sc. Eng. Phone: (+45) 30 26 45 54 Mail: [EMAIL PROTECTED], [EMAIL PROTECTED] DTU Informatics, Section for Statistics Technical University of Denmark, Build.321, DK-2800 Kgs. Lyngby, Denmark ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.