Your solution was educational. Thank you. I have two comments.
1) If you do not provide both options then you are forcing people to conform to 
your approach. In general I disapprove, but for specific cases I can see 
advantages.
2) Without reading the relevant papers (and possibly understanding them) is 
there a simple metric that would enable the correct choice between 
Welch-Shatterthwaite and Welch (1947)?
3) If there is a broad consensus that Welch (1947) is never the correct option 
then do not implement it.

As written, it sounds like Welch (1938) proposed a correction. Welch published 
another correction in 1947, but then retracted his 1947 correction in a 1949 
paper. At least that is how I interpret what was written in your option c.

Tim
-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of Dr. Rainer Düsing
Sent: Monday, November 27, 2023 7:42 AM
To: r-help@r-project.org
Subject: [R] t.test with Welch correction is ambiguous

[External Email]

Dear R Team!



There was an ongoing debate on Research Gate about the "Welch" option in your 
base R t.test command. A user noticed that the correction of the degrees of 
freedom labeled as "Welch Two Sample t-test", if you choose var.equal = TRUE in 
the R t.test command, differs from the output of the Stata analysis, which is 
also labeled as "Welch's degrees of freedom".  Confusingly enough, the R output 
coincided with the Stata result labeled as "Satterthwaite's degrees of 
freedom". Unfortunately, the R documentation wasn't clear either, since it 
lacks any specific reference and the formulation is
ambiguous: "If TRUE then the pooled variance is used to estimate the variance 
otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom 
is used." It rather sounds as if both options are available and not that both 
authors proposed the same correction separately.

After doing some research and looking into the R code, we found a solution and 
would like to suggest an update to the R documentation, to make it more clear 
(you can find the similar proposal to the Stata list here:
https://www.statalist.org/forums/forum/general-stata-discussion/general/1734987-unequal-vs-welch-options-for-ttest-why-no-mention-of-welch-1938-in-the-documentation
)

What is called "Welch Two Sample t-test" in the t.test command refers to two 
publications (see links below) with the same correction, namely Welch
(1938) and Satterthwaite (1946). Hence, you also find "Welch-Satterthwaite"
correction as a description in the literature for this (which is the 
aforementioned "Satterthwaite's degrees of freedom" correction in Stata).
But there is also another correction proposed by Welch (1947), which has 
slightly different denominators (see code below), which is called "Welch's 
degrees of freedom" in Stata. This option is not available in R so far.

Therefore, we suggest a) to cite the appropriate references in the 
documentation (at least Welch (1938) and Satterthwaite (1946)), b) adapt the 
output to something like "Welch-Satterthwaite adjusted Two Sample t-test" and 
maybe c) to incorporate the third option for the Welch (1947) adjustment, where 
the Welch-Satterthwaite correction should be the default option (Aspin & Welch, 
1949). Code proposal below for the df correction.

Best wishes,
Rainer Düsing



   1. ·  https://www.jstor.org/stable/2332010
   
<https://www.researchgate.net/deref/https%3A%2F%2Fwww.jstor.org%2Fstable%2F2332010?_tp=eyJjb250ZXh0Ijp7ImZpcnN0UGFnZSI6InF1ZXN0aW9uIiwicGFnZSI6InF1ZXN0aW9uIiwicG9zaXRpb24iOiJwYWdlQ29udGVudCJ9fQ>
   (Welch, 1938)
   2. ·  https://www.jstor.org/stable/3002019
   
<https://www.researchgate.net/deref/https%3A%2F%2Fwww.jstor.org%2Fstable%2F3002019?_tp=eyJjb250ZXh0Ijp7ImZpcnN0UGFnZSI6InF1ZXN0aW9uIiwicGFnZSI6InF1ZXN0aW9uIiwicG9zaXRpb24iOiJwYWdlQ29udGVudCJ9fQ>
   (Satterthwaite, 1946)
   3. ·  https://www.jstor.org/stable/2332510
   
<https://www.researchgate.net/deref/https%3A%2F%2Fwww.jstor.org%2Fstable%2F2332510?_tp=eyJjb250ZXh0Ijp7ImZpcnN0UGFnZSI6InF1ZXN0aW9uIiwicGFnZSI6InF1ZXN0aW9uIiwicG9zaXRpb24iOiJwYWdlQ29udGVudCJ9fQ>
   (Welch, 1947)
   4. ·  Aspin, Alice A., and B. L. Welch. "Tables for Use in Comparisons
   Whose Accuracy Involves Two Variances, Separately Estimated."
   *Biometrika* 36, no. 3/4 (1949): 290-96. https://doi.org/10.2307/2332668
   
<https://www.researchgate.net/deref/https%3A%2F%2Fdoi.org%2F10.2307%2F2332668?_tp=eyJjb250ZXh0Ijp7ImZpcnN0UGFnZSI6InF1ZXN0aW9uIiwicGFnZSI6InF1ZXN0aW9uIiwicG9zaXRpb24iOiJwYWdlQ29udGVudCJ9fQ>.
   [see point 4 in the Appendix by Welch]



var.equal = "yes"

var.equal = "Welch"

var.equal = "W-S"



vx <- var(x)

nx <- length(x)

vy <- var(y)

ny <- length(y)



if (var.equal == "yes") {

  df <- nx + ny - 2

  v <- 0

  if (nx > 1)

    v <- v + (nx - 1) * vx

  if (ny > 1)

    v <- v + (ny - 1) * vy

  v <- v/df

  stderr <- sqrt(v * (1/nx + 1/ny))

} else if (var.equal == "Welch") {

  stderrx <- sqrt(vx/nx)

  stderry <- sqrt(vy/ny)

  stderr <- sqrt(stderrx^2 + stderry^2)

  df <- -2+(stderr^4/(stderrx^4/(nx + 1) + stderry^4/(ny +1)))

} else {

  stderrx <- sqrt(vx/nx)

  stderry <- sqrt(vy/ny)

  stderr <- sqrt(stderrx^2 + stderry^2)

  df <- stderr^4/(stderrx^4/(nx - 1) + stderry^4/(ny -1))

}


--
*Dr. rer. nat. Rainer Düsing, Dipl.-Psych. * Universität Osnabrück Institut für 
Psychologie Fachgebiet Forschungsmethodik, Diagnostik und Evaluation 
Lise-Meitner-Str. 3
49076 Osnabrück

Raum 75/222
Tel: +49-541 969 7734
Email: radues...@uos.de <rdues...@uos.de>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.r-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to