On 2/3/2008 10:09 AM, Christoph Mathys wrote: > Dear R users, > > I have a linear model of the kind > > outcome ~ treatment + covariate > > where 'treatment' is a factor with three levels ("0", "1", and "2"), > and the covariate is continuous. Treatments "1" and "2" both have > regression coefficients significantly different from 0 when using > treatment contrasts with treatment "0" as the baseline. I would now like > to determine effect sizes (akin to Cohen's d in a two-sample comparison) > for the comparison to baseline of treatments "1" and "2". I have > illustrated a way to do this in the reproducible example below and am > grateful for any comments on the soundness of what I'm doing. I have not > yet found a way to determine confidence intervals for the effect sizes > derived below and would appreciate tips on that. > > set.seed(123456) # Make session reproducible > > # Set up the treatment factor with three levels and 100 observations > # each > treatment <- factor(c(rep(0, 100), rep(1, 100), rep(2, 100))) > > # Simulate outcomes > outcome <- rep(NA, 300) > outcome[treatment==0] <- rnorm(100, 10, 5) # baseline: mean=10, sd=5 > outcome[treatment==1] <- rnorm(100, 30, 5) # effect size 4 > outcome[treatment==2] <- rnorm(100, 40, 5) # effect size 6 > > # Check effect sizes (Cohen's d) > cohens.d <- function (x, y) {(mean(x)-mean(y))/sqrt((var(x)+var(y))/2) } > cohens.d(outcome[treatment==1], outcome[treatment==0]) > [1] 3.984774 > cohens.d(outcome[treatment==2], outcome[treatment==0]) > [1] 6.167798 > > # Sometimes standardized regression coefficients are recommended > # for determining effect size but that clearly doesn't work here: > coef(lm(scale(outcome) ~ treatment)) > (Intercept) treatment1 treatment2 > -1.233366 1.453152 2.246946 > # The reason it doesn't work is that the difference of outcome > # means is divided by the sd of *all* outcomes: > (mean(outcome[treatment==1])-mean(outcome[treatment==0]))/sd(outcome) > [1] 1.453152 > (mean(outcome[treatment==2])-mean(outcome[treatment==0]))/sd(outcome) > [1] 2.246946
How about scaling the outcome by the residual standard error from the unstandardized model? For example: library(MASS) cmat <- diag(3) diag(cmat) <- c(25,25,25) df <- data.frame(Y = c(mvrnorm(100, mu=c(10,30,40), Sigma=cmat, empirical=TRUE)), TX = factor(rep(c(0,1,2), each=100))) fm1 <- lm(Y ~ TX, data = df) fm2 <- lm(scale(Y, scale=summary(fm1)$sigma) ~ TX, data = df) summary(fm1) Call: lm(formula = Y ~ TX, data = df) Residuals: Min 1Q Median 3Q Max -12.7260 -3.5215 -0.1982 3.4517 12.1690 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 10.0000 0.5000 20.00 <2e-16 *** TX1 20.0000 0.7071 28.28 <2e-16 *** TX2 30.0000 0.7071 42.43 <2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 5 on 297 degrees of freedom Multiple R-squared: 0.8627, Adjusted R-squared: 0.8618 F-statistic: 933.3 on 2 and 297 DF, p-value: < 2.2e-16 summary(fm2) Call: lm(formula = scale(Y, scale = summary(fm1)$sigma) ~ TX, data = df) Residuals: Min 1Q Median 3Q Max -2.54521 -0.70431 -0.03965 0.69034 2.43380 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -3.3333 0.1000 -33.33 <2e-16 *** TX1 4.0000 0.1414 28.28 <2e-16 *** TX2 6.0000 0.1414 42.43 <2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 1 on 297 degrees of freedom Multiple R-squared: 0.8627, Adjusted R-squared: 0.8618 F-statistic: 933.3 on 2 and 297 DF, p-value: < 2.2e-16 confint(fm2) 2.5 % 97.5 % (Intercept) -3.530132 -3.136535 TX1 3.721685 4.278315 TX2 5.721685 6.278315 I've never seen this approach before, and I'm not how it would work when the variances within groups are heterogeneous or one or more covariates are added to the model. hope this helps, Chuck > # Now, create a situation where Cohen's d is impossible to > # calculate directly by introducing a continuous covariate: > covariate <- 1:300 > outcome <- outcome + rnorm(300, covariate, 2) > model <- lm(scale(outcome) ~ treatment + scale(covariate)) > coef(model) > (Intercept) treatment1 treatment2 scale(covariate) > -0.1720456 0.1994251 0.3167116 0.8753761 > > # Proposed way to determine effect size: simulate outcomes for each > # treatment level assuming the covariate to have a fixed value (here > # its mean value after standardization: zero) > library(MCMCpack) > no.of.sims <- 10000 > sims.model <- MCMCregress(model, mcmc=no.of.sims) > sims.model[1:2,] > (Intercept) treatment1 treatment2 scale(covariate) sigma2 > [1,] -0.1780735 0.2024111 0.3395233 0.8682119 0.002617449 > [2,] -0.1521623 0.1773623 0.2956053 0.8764573 0.003529013 > sims.treat0 <- rnorm(no.of.sims, sims.model[,"(Intercept)"], > sqrt(sims.model[,"sigma2"])) > sims.treat1 <- rnorm(no.of.sims, sims.model[,"(Intercept)"] + > sims.model[,"treatment1"], sqrt(sims.model[,"sigma2"])) > sims.treat2 <- rnorm(no.of.sims, sims.model[,"(Intercept)"] + > sims.model[,"treatment2"], sqrt(sims.model[,"sigma2"])) > > # Calculate Cohen's d for simulated values > cohens.d(sims.treat1, sims.treat0) > [1] 3.683093 > cohens.d(sims.treat2, sims.treat0) > [1] 5.782622 > > These values are reasonably close to the ones (4 and 6) I plugged in at > the beginning. It would be even nicer to have a confidence interval for > them, but if I bootstrap one out of the simulated outcomes its width > depends on the number of simulations and is therefore arbitrary. If > anyone knew a better way to get at the effect sizes I'm looking for or > how I could also get confidence intervals for them, that would be > greatly appreciated. > > Thanks, > > Christoph > > -- > Christoph Mathys, M.S. > Music and Neuroimaging Laboratory > Beth Israel Deaconess Medical Center > and Harvard Medical School > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894 ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.