[R] Validity of Pearson's Chi-Square for Large Tables

dsimcha Tue, 02 Jun 2009 21:59:08 -0700

Is Pearson's chi-square test for contingency tables asymptotically unbiased
for large tables (many degrees of freedom), regardless of the expected
value in each cell?  The rule of thumb is that Pearson's chi-square test
should not be used when a large number of cells have expected values < 5.
However, I compared the results on 4x4 contingency tables for R's chisq.test
using the chi-square approximation vs. chisq.test using a large number of
Monte Carlo simulations, and the results agree within a fairly small error.
This is true even when every cell of the table has an expected value < 2.
I tried several tables, but the best example was:


4  1  1  1
1  4  1  1
1  1  4  1
1  1  1  4
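
For anyone who wants to reproduce the comparison, here is a minimal sketch.
It assumes the Monte Carlo run used chisq.test's simulate.p.value argument;
the seed and the number of replicates B are arbitrary choices, not values
from the original comparison.

```r
# The 4x4 table above: 4 on the diagonal, 1 everywhere else.
tab <- matrix(1, nrow = 4, ncol = 4)
diag(tab) <- 4

# Asymptotic chi-square approximation. R warns that the approximation
# may be incorrect because the expected counts are small.
asym <- suppressWarnings(chisq.test(tab))

# Monte Carlo p-value (assumed setup: B and the seed are arbitrary).
set.seed(1)
mc <- chisq.test(tab, simulate.p.value = TRUE, B = 20000)

asym$expected    # every cell is 7 * 7 / 28 = 1.75
asym$statistic   # Pearson statistic, 9 degrees of freedom
asym$p.value     # p-value from the chi-square approximation
mc$p.value       # p-value from the simulation
```

Both p-values are computed from the same Pearson statistic, so any
difference comes only from the reference distribution.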

By contrast, and as expected, the chi-square approximation appears to be
very poor when both the expected values and the degrees of freedom are
small.  Is there a good theoretical reason why the chi-square test seems to
perform well on large contingency tables even with small expected values?
Are the standard rules of thumb overly simplistic?
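
The small-table case can be sketched the same way.  The 2x2 table below is
a hypothetical example chosen for illustration, not one of the tables from
the comparison above; the seed and B are again arbitrary.

```r
# Hypothetical 2x2 table: every expected count is 4 * 4 / 8 = 2, and
# there is only 1 degree of freedom.
small <- matrix(c(3, 1,
                  1, 3), nrow = 2, byrow = TRUE)

# Asymptotic p-value without the continuity correction, so that it uses
# the same statistic as the simulation.
small_asym <- suppressWarnings(chisq.test(small, correct = FALSE))

# Monte Carlo p-value (assumed setup: B and the seed are arbitrary).
set.seed(1)
small_mc <- chisq.test(small, simulate.p.value = TRUE, B = 20000)

small_asym$expected
small_asym$p.value
small_mc$p.value
```

With so few possible tables for fixed margins, the statistic can take only
a handful of distinct values, which is one reason a continuous chi-square
curve fits badly here.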