Dear list,

I'm a bit perplexed why the 95% confidence bands for the predicted 
probabilities for units where x=0 and x=1 overlap in the following instance.

I've simulated binary data to which I've then fitted a simple logistic 
regression model, with one covariate, and the coefficient on x is statistically 
significant at the 0.05 level. I've then used two different methods to generate 
95% confidence bands for predicted probabilities, for each of two possible 
values of x. Given the result of the model, I expected the bands not to 
overlap… but they do.

Can anyone please explain where I've gone wrong with the following code, OR why 
it makes sense for the bands to overlap, despite the statistically significant 
beta coefficient? There may be a good statistical reason for this, but I'm not 
aware of it.

Many thanks,
Malcolm Fairbrother


n <- 120
set.seed(030512)
x <- rbinom(n, 1, 0.5)
dat <- within(data.frame(x), ybe <- rbinom(n, 1, plogis(-0.5 + x)))
mod1 <- glm(ybe ~ x, dat, family=binomial)
summary(mod1) # coefficient on x is statistically significant at the 0.05 
level… almost at the 0.01 level

pred <- predict(mod1, newdata=data.frame(x=c(0,1)), se.fit=T)
with(pred, cbind(low = plogis(fit - 1.96*se.fit), est = plogis(fit), up = 
plogis(fit + 1.96*se.fit)))  # confidence bands based on SEs

# simulation-based confidence bands:
sims <- t(replicate(200, coef(glm(simulate(mod1)$sim_1 ~ x, data=dat, 
family=binomial))))
pred0 <- plogis(quantile(sims%*%c(1,0), c(0.025, 0.5, 0.975)))
pred1 <- plogis(quantile(sims%*%c(1,1), c(0.025, 0.5, 0.975)))
rbind(pred0, pred1)

# the upper bound of the prediction for x=0 is greater than the lower bound of 
the prediction for x=1, using both methods

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to