Is Pearson's chi-square test for contingency tables asymptotically unbiased for large tables (large degrees of freedom), regardless of the expected values in each cell? The rule of thumb is that Pearson's chi-square should not be used when a large number of cells have expected values < 5. However, I compared the results on 4x4 contingency tables from R's chisq.test using the chi-square approximation vs. chisq.test using a large number of Monte Carlo simulations, and the results agree within a fairly small error. This is true even when every cell of the table has an expected value < 2. I tried several tables, but the best example was:

    4 1 1 1
    1 4 1 1
    1 1 4 1
    1 1 1 4

As expected, the chi-square approximation appears to be very poor when both the expected values and the degrees of freedom are small. Is there a good theoretical reason why the chi-square test seems to perform well on large contingency tables even with small expected values? Are the standard rules of thumb overly simplistic?
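
For anyone who wants to reproduce the comparison, here is a minimal sketch using the table above. The number of replicates (B = 100000) and the seed are assumptions chosen for illustration; the post only says "a large number" of simulations. The names tab, asym and mc are likewise just illustrative.

```r
# Example table from the post: every row and column sums to 7, total 28,
# so every expected count is 7 * 7 / 28 = 1.75
tab <- matrix(c(4, 1, 1, 1,
                1, 4, 1, 1,
                1, 1, 4, 1,
                1, 1, 1, 4), nrow = 4, byrow = TRUE)

# Asymptotic chi-square approximation; R warns here because the
# expected counts are small, so the warning is suppressed
asym <- suppressWarnings(chisq.test(tab))

# Monte Carlo p-value with both margins held fixed
# (B and the seed are assumed values, not taken from the post)
set.seed(1)
mc <- chisq.test(tab, simulate.p.value = TRUE, B = 100000)

# Expected counts under independence
print(asym$expected)

# Test statistic, degrees of freedom, and the two p-values side by side
print(c(statistic   = unname(asym$statistic),
        df          = unname(asym$parameter),
        asymptotic  = asym$p.value,
        monte_carlo = mc$p.value))
```

Swapping in other tables for tab (for example a 2x2 or 2x3 table with similarly small counts) is a quick way to see how the agreement changes as the degrees of freedom shrink.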