Re: [R] Can I use "mcnemar.test" for 3*3 tables (or is there a bug in the command?)

David Winsemius Mon, 20 Jul 2009 11:04:38 -0700


On Jul 20, 2009, at 5:22 AM, Tal Galili wrote:

Hello David Winsemius and the rest of the R help group,
David, I tried to answer your question to the best of my abilities. If I was unclear or am still leaving some things out, please help me in focusing my situation even further. Here are my answers to the questions you posed:
1) Please define "better" -
"better" is the one that is able to handle the questions at hand (marginal homogeneity and symmetry) while giving meaningful results although the data is sometimes sparse (with zeros in it) and the sample size is somewhat small (around 25 kids).

Frankly, the phrases "marginal homogeneity" and "symmetry" are, for me anyway, not particularly evocative of an interpretable sort of difference. I try to express my findings in terms I think my audience may have some chance of understanding: odds ratios or risk ratios or difference in mean effects ...

2) And now ... define "right"
What I meant by "right" is "what test did each of these procedures just perform" and also "what can I learn from each of the P's if they were to pass the significance bar (of, let's say, .05)".

3) "Perhaps from the perspective of a statistically naive reviewer."
Thank you for pointing out that this is superficial. I would love any help you could give in deepening my understanding.

It appeared from context (which was snipped) that you thought one was "better" because its p-value was lower. If the criterion by which you choose one statistical test over another is whether or not it happens to produce a signal p < 0.05, then I think you are dredging rather than analyzing. I think the question should be instead whether the test is the most powerful for the particular hypothesis and data situation.

4) "The problem I am trying to solve" is for the following situation:

The data set:
I am analyzing a data set with subjects (kids) listening to the same music two times (randomized, at different times, and so on). The condition of the experiment is a bit different the first time the kid listens (X=1) than the second time (X=2). The response (Y) the kid makes in the experiment is recorded as an ordered number with three levels: -1, 0, 1.

So you would certainly want a test that properly handles ordinal effects. I am not sure that was clear at all from your earlier questions. Tests of hypotheses regarding ordered alternatives are often more powerful than ones that evaluate less specific alternatives.

The (statistical) question: did the difference in the experimental conditions yield different rankings from the kids? And if so, was there a specific direction? E.g.: did kids who until now (in part one of the experiment) answered mostly -1 and 0 now (in part two of the experiment) start answering more 0 and 1? Or did kids who until now mostly answered 0 now start answering -1 and 1? And so on.
Analysis approach:
There are two basic ways to do this.
1) The first one is a Willcox test, to see if there was a change in answers (Y) between the two situations (X=1, X=2)

I am here puzzled. Is the Willcox test a well known one in your academic domain? If it is I apologize for my lack of breadth in named tests. Or could you be referring to what is invoked in R with wilcox.test()? I am guessing from context that you might be asking about the Wilcoxon signed-rank test for paired data situations. It would in fact address the ordering of your paired outcomes, but all of the Wilcoxon tests are based on the measures being from a continuous distribution and statistical validity for your situation would be questionable.
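To make the concern about ties concrete, here is a minimal sketch in base R. The twelve paired responses are made up purely for illustration (they are not the poster's data), and the variable names are hypothetical. It shows that paired differences of three-level responses pile up on a handful of values, so zeros and tied ranks are unavoidable.

```r
# Hypothetical paired responses on the -1/0/1 scale (made up for illustration).
first  <- c(-1, -1, 0, 0, 0, 1, -1, 0, 1, 0, -1, 0)
second <- c( 0,  0, 0, 1, 1, 1,  0, 0, 1, 1, -1, 1)

# With three response levels the paired differences can take at most five
# distinct values, so zeros and tied ranks are unavoidable.
diffs <- second - first
table(diffs)

# The signed-rank test still runs, but R may warn that an exact p-value
# cannot be computed in the presence of zeros or ties.
res <- suppressWarnings(wilcox.test(second, first, paired = TRUE))
res$p.value
```

Whether the resulting approximate p-value is trustworthy with this few distinct values is exactly the validity question raised above.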

I would think that a proportional odds model for ordinal repeated responses would fit the data situation and the hypothesis of interest. You may want to search out Laura Thompson's R/S companion to Agresti's text. She has some worked examples.

2) The second one is to produce a 3 by 3 table, with the rows indicating what the kids answered to setting 1 of the experiment, and the columns indicating the kids' answers to setting 2.
Now the question is:
Was there marginal homogeneity? If not, then that is an indicator that the general response to the experimental settings was different for the kids.

Can you put into natural language what you will explain to your audience once you determine the presence or absence of "marginal homogeneity"?

Challenges:
1) what about symmetry ?
As Peter pointed out - you can easily check that the following two matrices have the same homogeneous margins, but only one is symmetric:
3 2 1
2 3 2
1 2 3

3 1 2
3 3 1
0 3 3
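The claim about these two matrices can be checked directly in base R. The sketch below enters them row by row; the object names `sym` and `asym` are introduced here only for illustration.

```r
# The two 3x3 tables quoted above, entered row by row.
sym  <- matrix(c(3, 2, 1,
                 2, 3, 2,
                 1, 2, 3), nrow = 3, byrow = TRUE)
asym <- matrix(c(3, 1, 2,
                 3, 3, 1,
                 0, 3, 3), nrow = 3, byrow = TRUE)

# Marginal homogeneity: each row total equals the matching column total,
# so these differences are all zero for both tables.
rowSums(sym)  - colSums(sym)
rowSums(asym) - colSums(asym)

# Symmetry: cell [i, j] equals cell [j, i] for every pair of categories.
isSymmetric(sym)    # TRUE
isSymmetric(asym)   # FALSE
```

Both tables have row and column totals of 6, 7, 6, so a test of marginal homogeneity has nothing to detect in either, while a test of symmetry can still reject for the second one.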
And running the two tests we have yields very interesting results (and if someone has an explanation for them, it would be greatly appreciated):
> tt <- as.table(t(matrix(c(30,10,20,
+                           30,30 ,10,
+                           0 ,30 ,30)
+                           , ncol .... [TRUNCATED]

The truncation is most unfortunate since it results in our not seeing what made these two calls different.


> print(tt)
   A  B  C
A 30 10 20
B 30 30 10
C  0 30 30

> mcnemar.test(tt)

        McNemar's Chi-squared test

data:  tt
McNemar's chi-squared = 40, df = 3, p-value = 1.066e-08


> mh_test(tt)

        Asymptotic Marginal-Homogeneity Test

data:  response by groups (Var1, Var2)
         stratified by block
chi-squared = 0, df = 2, p-value = 1


> tt <- as.table(t(matrix(c(30,10,20,
+                           30,30 ,10,
+                           1 ,30 ,30)
+                           , ncol .... [TRUNCATED]

> print(tt)
   A  B  C
A 30 10 20
B 30 30 10
C  1 30 30

> mcnemar.test(tt)

        McNemar's Chi-squared test

data:  tt
McNemar's chi-squared = 37.1905, df = 3, p-value = 4.194e-08


The truncation snipped off the likely sources of the differences.



> mh_test(tt)

        Asymptotic Marginal-Homogeneity Test

data:  response by groups (Var1, Var2)
         stratified by block
chi-squared = 0.0244, df = 2, p-value = 0.9879
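The two mh_test() values printed above can be reproduced in base R without the coin package. The sketch below is a hand-rolled Stuart-Maxwell statistic, not coin's implementation; the function name `stuart_maxwell` and the objects `tt0` and `tt1` are introduced here for illustration, and the tables are rebuilt from the printed output because the original calls were truncated.

```r
# Stuart-Maxwell statistic for a square table of counts (base R only).
stuart_maxwell <- function(x) {
  x <- as.matrix(x)
  k <- nrow(x)
  d <- rowSums(x) - colSums(x)          # marginal differences
  S <- -(x + t(x))                      # off-diagonal covariance terms
  diag(S) <- rowSums(x) + colSums(x) - 2 * diag(x)
  keep <- seq_len(k - 1)                # drop one category (d sums to zero)
  stat <- drop(t(d[keep]) %*% solve(S[keep, keep]) %*% d[keep])
  c(statistic = stat, df = k - 1,
    p.value = pchisq(stat, df = k - 1, lower.tail = FALSE))
}

# Tables rebuilt from the printed output above.
tt0 <- matrix(c(30, 10, 20,
                30, 30, 10,
                 0, 30, 30), nrow = 3, byrow = TRUE)
tt1 <- tt0
tt1[3, 1] <- 1

stuart_maxwell(tt0)   # marginal differences are all zero
stuart_maxwell(tt1)   # marginal differences are -1, 0, 1
```

In the first table the row and column totals are identical (60, 70, 60), so the statistic is exactly 0 however asymmetric the off-diagonal cells are. That is why the two tests can disagree so sharply on the same table: they test different hypotheses.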


2) what about sparsity ?

What is the correct way to handle a sparse table that includes some zeros? (Is filling them with 1's, in cases where the mcnemar is resulting in NA's, a legitimate strategy?)
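For context on where the NA's come from, here is a small base R sketch. The table `sparse` is hypothetical, made up only to show the mechanism; it takes no position on whether filling cells with 1's is legitimate.

```r
# Hypothetical sparse table: cells [1, 3] and [3, 1] are both zero.
sparse <- matrix(c(5, 2, 0,
                   1, 6, 3,
                   0, 2, 4), nrow = 3, byrow = TRUE)

# Each off-diagonal pair contributes (n_ij - n_ji)^2 / (n_ij + n_ji);
# the empty pair gives 0 / 0, which is NaN.
up <- upper.tri(sparse)
contrib <- (sparse - t(sparse))[up]^2 / (sparse + t(sparse))[up]
contrib

# The NaN propagates through the sum, so the statistic and p-value are NaN too.
res <- mcnemar.test(sparse)
res$statistic
res$p.value
```

So a single zero cell is harmless; the trouble arises only when both cells of a mirrored pair are empty.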




David, thank you for the queries and the good intentions,

I would be very happy for any help, directions, or clarifications from you and from the other members of this wonderful discussion group.



With much gratitude,
Tal



I hope this helped clarify.

On Mon, Jul 20, 2009 at 3:20 AM, David Winsemius <dwinsem...@comcast.net> wrote:


On Jul 19, 2009, at 6:09 PM, Tal Galili wrote:

Hello Charles,
Thank you for the detailed reply.

I am still left with the leading question, which is: which test should I use when analyzing the 3 by 3 matrix I have? The mcnemar.test or the mh_test?

Is one necessarily better than the other?

Please define "better".


(for example, for sparser matrices?)

That does not help.



What about:
mh_test(as.table(matrix(1:16,4)))
It returns a very significant result:
chi-squared = 11.4098, df = 3, p-value = 0.009704

Whereas "mcnemar.test(matrix(1:16,4))" didn't:
McNemar's chi-squared = 11.5495, df = 6, p-value = 0.0728

So which one is "right" ?

And now ... define "right".


(from the looks of it, the mh_test is doing much better)

Perhaps from the perspective of a statistically naive reviewer.

Should the strategy be to try and use both methods, and start digging when one doesn't sit well with the other?

I am reminded of Jim Holtman's tag line: "What problem are you trying to solve?"





Thanks,
Tal

On Sun, Jul 19, 2009 at 10:26 PM, Charles C. Berry <cbe...@tajo.ucsd.edu> wrote:


On Sun, 19 Jul 2009, Tal Galili wrote:

Hello David,
Thank you for your answer.

Do you know then what the "mcnemar.test" does in the case of a 3*3 table?


     print(mcnemar.test)

will show you what it does.

Because the results for the simple example I gave are rather different (P value of 0.053 VS 0.73)

The test mcnemar.test() constructs is one of symmetry, which is equivalent to marginal homogeneity in hierarchical log-linear models as I recall from Bishop, Fienberg, and Holland's 1975 opus on count data.

Stuart-Maxwell uses the dispersion matrix of marginal difference.

These are two different tests. I suspect that Stuart-Maxwell is less susceptible to continuity issues in very sparse tables, which may account for the difference you see here.

In case the mcnemar can't really handle a 3*3 matrix (or more), shouldn't there be an error message for this case? (If so, who should I turn to in order to report this?)


Well, the code is pretty straightforward and

     mcnemar.test(matrix(1:16,4))

returns 11.5495 which is correct.

It looks like there is nothing to report.
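The 11.5495 figure can be confirmed by hand. The sketch below writes out the symmetry statistic that the mcnemar.test() source computes for tables larger than 2x2: the sum over pairs i < j of (n_ij - n_ji)^2 / (n_ij + n_ji), on k(k - 1)/2 degrees of freedom. The helper name `symmetry_stat` is introduced here for illustration and is not part of R.

```r
# Symmetry statistic: sum over pairs i < j of
# (n_ij - n_ji)^2 / (n_ij + n_ji), with k(k - 1)/2 degrees of freedom.
symmetry_stat <- function(x) {
  x    <- as.matrix(x)
  up   <- upper.tri(x)
  num  <- (x - t(x))[up]^2
  den  <- (x + t(x))[up]
  stat <- sum(num / den)
  df   <- nrow(x) * (nrow(x) - 1) / 2
  c(statistic = stat, df = df,
    p.value = pchisq(stat, df, lower.tail = FALSE))
}

m <- matrix(1:16, 4)
by_hand  <- symmetry_stat(m)
built_in <- mcnemar.test(m)

by_hand
all.equal(unname(by_hand["statistic"]), unname(built_in$statistic))
```

For a 4x4 table there are six off-diagonal pairs, hence df = 6, compared with df = 3 for the marginal homogeneity test on the same table.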


Chuck


Thanks again,
Tal





On Sun, Jul 19, 2009 at 3:47 PM, David Freedman <3.14da...@gmail.com> wrote:


There is a function mh_test in the coin package.

library(coin)
mh_test(tt)

The documentation states, "The null hypothesis of independence of row and column totals is tested. The corresponding test for binary factors x and y is known as McNemar test. For larger tables, Stuart’s W0 statistic (Stuart, 1955, Agresti, 2002, page 422, also known as Stuart-Maxwell test) is computed."

hth, david freedman


Tal Galili wrote:


Hello all,

I wish to perform a McNemar test for a 3 by 3 matrix.
By running the standard R command I am getting a result, but I am not sure I am getting the correct one.
Here is an example code:

(tt <-  as.table(t(matrix(c(1,4,1    ,
                         0,5,5,
                         3,1,5), ncol = 3))))
mcnemar.test(tt, correct=T)
#And I get:
     McNemar's Chi-squared test
data:  tt
McNemar's chi-squared = 7.6667, df = 3, p-value = *0.05343*


Now I was wondering if the test I just performed is the correct one.

From looking at the Wikipedia article on McNemar's test (http://en.wikipedia.org/wiki/McNemar's_test), it is said that:
"The Stuart-Maxwell test <http://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm> is different generalization of the McNemar test, used for testing marginal homogeneity in a square table with more than two rows/columns"

From searching for a Stuart-Maxwell test <http://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm> in Google, I found an algorithm here:

http://www.m-hikari.com/ams/ams-password-2009/ams-password9-12-2009/abbasiAMS9-12-2009.pdf


From running this algorithm I am getting a different P value; here is the (somewhat ugly) code I produced for this:
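get.d <- function(xx)
{
  # differences between the row and column marginal totals
  ret1 <- margin.table(xx,1) - margin.table(xx,2)
  return(ret1)
}

get.s <- function(xx)
{
  # covariance matrix of the marginal differences
  the.s <- xx
  for( i in 1:dim(xx)[1])
  {
    for(j in 1:dim(xx)[2])
    {
      if(i == j)
      {
        the.s[i,j] <- margin.table(xx,1)[i] + margin.table(xx,2)[i] - 2*xx[i,i]
      } else {
        the.s[i,j] <- -(xx[i,j] + xx[j,i])
      }
    }
  }
  return(the.s)
}

# drop the last category; the statistic has (3 - 1) = 2 degrees of freedom
chi.statistic <- t(get.d(tt)[-3]) %*% solve(get.s(tt)[-3,-3]) %*% get.d(tt)[-3]
# the P value is the upper tail; pchisq(chi.statistic, 2) alone gives the lower tail (0.268)
paste("the P value:", pchisq(chi.statistic, 2, lower.tail = FALSE))

#and the result was:
"the P value: 0.731615628946642"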



So to summarize my questions:
1) Can I use "mcnemar.test" for 3*3 (or more) tables?
2) If so, what test is being performed (Stuart-Maxwell <http://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm>)?
3) Do you have a recommended link to an explanation of the algorithm employed?


Thanks,
Tal

snipped various sigs



My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
