Hello all,
because of unexplained differences between the statistical results
of our collaborators and those of our lab on the same large
proteomics dataset, we re-examined the t.test() function. We found
a surprising behaviour that is also reproducible in the following
small test dataset:
Suppose we have two vectors of numbers, with some missing values,
that refer to the same individuals and should therefore be
evaluated with a paired t-test:
testdata.A <- c(1.15, -0.2, NA, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01)
testdata.B <- c(1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, NA)
Then
print(t.test(testdata.A, testdata.B, paired=TRUE,
alternative="two.sided", na.action="na.pass"))
and
print(t.test(testdata.A, testdata.B, paired=TRUE,
alternative="two.sided", na.action="na.exclude"))
both deliver the same p-value (0.1162, identical to Excel's result).
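As a cross-check, here is a minimal sketch of the paired test computed by hand from the complete pairs only. The names complete, d, n, t.stat, p.manual and p.builtin are ours and exist only for illustration; the sketch assumes that a paired test should use exactly those individuals for which both values are present.

```r
testdata.A <- c(1.15, -0.2, NA, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01)
testdata.B <- c(1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, NA)

# Keep only the pairs in which both measurements are present.
complete <- complete.cases(testdata.A, testdata.B)
d <- testdata.A[complete] - testdata.B[complete]

# Paired t-test by hand: a one-sample t-test on the differences.
n        <- length(d)
t.stat   <- mean(d) / (sd(d) / sqrt(n))
p.manual <- 2 * pt(-abs(t.stat), df = n - 1)

# The same test through the two-vector interface, for comparison.
p.builtin <- t.test(testdata.A, testdata.B, paired = TRUE)$p.value

print(c(n = n, t = t.stat, p.manual = p.manual, p.builtin = p.builtin))
```

Eight complete pairs remain, and both p-values should agree with the 0.1162 quoted above.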
However, after combining the two vectors with
testdata <- c(testdata.A, testdata.B)
and defining a criterion vector with
criterion <- c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1)
(this is the data layout we use in our proteomics project),
we get a different p-value (0.01453) with
print(t.test(testdata ~ criterion, paired=TRUE,
alternative="two.sided", na.action="na.exclude"))
The statement
print(t.test(testdata ~ criterion, paired=TRUE,
alternative="two.sided", na.action="na.pass"))
however, delivers a p-value of 0.1162 again.
With
print(t.test(testdata[criterion==0], testdata[criterion==1],
paired=TRUE, alternative="two.sided", na.action="na.exclude"))
which imitates the first form, we again get a p-value of 0.1162.
What is the reason for the different p-values? Should not all calls
to t.test() that exclude missing values be equivalent and therefore
deliver the same result?
Excel, StatView and KaleidaGraph all display a p-value of 0.1162.
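To narrow this down, here is a minimal sketch that reproduces both values without the formula interface. It rests on an assumption we have not verified in the t.test() source: that na.exclude removes each row containing an NA from the long-format data before the two groups are paired, so that the remaining values no longer line up by individual. The names long, x, y, ok, p.shifted and p.aligned are ours and exist only for illustration.

```r
testdata.A <- c(1.15, -0.2, NA, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01)
testdata.B <- c(1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, NA)
testdata   <- c(testdata.A, testdata.B)
criterion  <- rep(c(0, 1), each = 10)

# Assumed behaviour: NA rows are dropped from the long-format data,
# independently of which individual they belong to.
long <- na.exclude(data.frame(testdata, criterion))
x <- long$testdata[long$criterion == 0]   # A without its NA (individual 3)
y <- long$testdata[long$criterion == 1]   # B without its NA (individual 10)

# Both groups keep 9 values, but from the third position onwards
# x[i] and y[i] belong to different individuals.
print(table(long$criterion))
p.shifted <- t.test(x, y, paired = TRUE)$p.value

# Pairwise deletion: drop an individual if either value is missing.
ok <- complete.cases(testdata.A, testdata.B)
p.aligned <- t.test(testdata.A[ok], testdata.B[ok], paired = TRUE)$p.value

print(c(p.shifted = p.shifted, p.aligned = p.aligned))
```

If the assumption holds, p.shifted should match the 0.01453 from the formula call with na.exclude, and p.aligned the 0.1162 from all other calls. That would suggest the formula call pairs values from different individuals.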
J. W. D.
--
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
-- Dr. Johannes W. Dietrich, M.D.
-- Laboratory XU44, Endocrine Research
-- Medical Hospital I, Bergmannsheil University Hospitals
-- Ruhr University of Bochum
-- Buerkle-de-la-Camp-Platz 1, D-44789 Bochum, NRW, Germany
-- Phone: +49:234:302-6400, Fax: +49:234:302-6403
-- eMail: "j.w.dietr...@medical-cybernetics.de"
-- WWW: http://medical-cybernetics.de
-- WWW: http://www.bergmannsheil.de
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --