Dear R users, I have a linear model of the kind
outcome ~ treatment + covariate where 'treatment' is a factor with three levels ("0", "1", and "2"), and the covariate is continuous. Treatments "1" and "2" both have regression coefficients significantly different from 0 when using treatment contrasts with treatment "0" as the baseline. I would now like to determine effect sizes (akin to Cohen's d in a two-sample comparison) for the comparison to baseline of treatments "1" and "2". I have illustrated a way to do this in the reproducible example below and am grateful for any comments on the soundness of what I'm doing. I have not yet found a way to determine confidence intervals for the effect sizes derived below and would appreciate tips on that. set.seed(123456) # Make session reproducible # Set up the treatment factor with three levels and 100 observations # each treatment <- factor(c(rep(0, 100), rep(1, 100), rep(2, 100))) # Simulate outcomes outcome <- rep(NA, 300) outcome[treatment==0] <- rnorm(100, 10, 5) # baseline: mean=10, sd=5 outcome[treatment==1] <- rnorm(100, 30, 5) # effect size 4 outcome[treatment==2] <- rnorm(100, 40, 5) # effect size 6 # Check effect sizes (Cohen's d) cohens.d <- function (x, y) {(mean(x)-mean(y))/sqrt((var(x)+var(y))/2) } cohens.d(outcome[treatment==1], outcome[treatment==0]) [1] 3.984774 cohens.d(outcome[treatment==2], outcome[treatment==0]) [1] 6.167798 # Sometimes standardized regression coefficients are recommended # for determining effect size but that clearly doesn't work here: coef(lm(scale(outcome) ~ treatment)) (Intercept) treatment1 treatment2 -1.233366 1.453152 2.246946 # The reason it doesn't work is that the difference of outcome # means is divided by the sd of *all* outcomes: (mean(outcome[treatment==1])-mean(outcome[treatment==0]))/sd(outcome) [1] 1.453152 (mean(outcome[treatment==2])-mean(outcome[treatment==0]))/sd(outcome) [1] 2.246946 # Now, create a situation where Cohen's d is impossible to # calculate directly by introducing a continuous covariate: covariate <- 1:300 outcome <- outcome + rnorm(300, covariate, 2) model <- lm(scale(outcome) ~ treatment + scale(covariate)) coef(model) (Intercept) treatment1 treatment2 scale(covariate) -0.1720456 0.1994251 0.3167116 0.8753761 # Proposed way to determine effect size: simulate outcomes for each # treatment level assuming the covariate to have a fixed value (here # its mean value after standardization: zero) library(MCMCpack) no.of.sims <- 10000 sims.model <- MCMCregress(model, mcmc=no.of.sims) sims.model[1:2,] (Intercept) treatment1 treatment2 scale(covariate) sigma2 [1,] -0.1780735 0.2024111 0.3395233 0.8682119 0.002617449 [2,] -0.1521623 0.1773623 0.2956053 0.8764573 0.003529013 sims.treat0 <- rnorm(no.of.sims, sims.model[,"(Intercept)"], sqrt(sims.model[,"sigma2"])) sims.treat1 <- rnorm(no.of.sims, sims.model[,"(Intercept)"] + sims.model[,"treatment1"], sqrt(sims.model[,"sigma2"])) sims.treat2 <- rnorm(no.of.sims, sims.model[,"(Intercept)"] + sims.model[,"treatment2"], sqrt(sims.model[,"sigma2"])) # Calculate Cohen's d for simulated values cohens.d(sims.treat1, sims.treat0) [1] 3.683093 cohens.d(sims.treat2, sims.treat0) [1] 5.782622 These values are reasonably close to the ones (4 and 6) I plugged in at the beginning. It would be even nicer to have a confidence interval for them, but if I bootstrap one out of the simulated outcomes its width depends on the number of simulations and is therefore arbitrary. If anyone knew a better way to get at the effect sizes I'm looking for or how I could also get confidence intervals for them, that would be greatly appreciated. Thanks, Christoph -- Christoph Mathys, M.S. Music and Neuroimaging Laboratory Beth Israel Deaconess Medical Center and Harvard Medical School ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.