I know that you didn't ask for this, but to me this seems to be a very dodgy method for selecting a "best number of clusters", with no proper basis at all. All of these tests are data dependent (the clusterings are estimated from the same data that are then tested), so the p-values cannot be interpreted in the usual way. It is actually not clear how they can be interpreted. Moreover, the freedom in the data to find a clustering depends on the number of clusters, so there is no reason to expect that comparing p-values for different numbers of clusters tells you anything meaningful. Do you really think that it is an informative difference if one clustering gives you p=10^{-58} and another one 10^{-30}?
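
A minimal sketch of this point, under assumptions that are not part of the original analysis: the data are simulated from a single normal distribution (so there are no real clusters), k-means stands in for the mixture model, and a Welch one-way test stands in for Fisher's exact test. The names x and p_for_k are made up for the illustration.

```r
# Illustration only: simulated data, not the data from the original post.
set.seed(1)
x <- rnorm(200)  # one homogeneous group, no true cluster structure

# Cluster into k groups, then test for a difference between those same groups.
# The groups were chosen to be as different as possible, so the test is
# data dependent and its p-value is not an ordinary p-value.
p_for_k <- function(k) {
  cl <- kmeans(x, centers = k, nstart = 10)$cluster
  oneway.test(x ~ factor(cl))$p.value
}

p_values <- sapply(2:4, p_for_k)
names(p_values) <- paste0("k=", 2:4)
print(p_values)
```

Every k should give an extremely small p-value here although the data contain no clusters at all, which is why the size of such p-values says little about which k is "best". The numbers are read from the `$p.value` component of the test object; the "< 2.2e-16" in printed output is only a display threshold.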

Christian

On Thu, 17 Dec 2009, Søren Faurby wrote:

In an effort to select the most appropriate number of clusters in a
mixture analysis I am comparing the expected and actual membership of
individuals in various clusters using Fisher's exact test. I aim
for the model with the lowest possible p-value, but I frequently get
p-values below 2.2e-16 and therefore do not get exact p-values with
standard Fisher's exact tests in R.

Does anybody know if there is a version of Fisher's exact test in
any package which can handle lower probabilities, or have other
suggestions as to how I can compare the probabilities?

I am for instance comparing the following two:

dat2<-matrix(c(29,0,29,0,12,0,18,0,0,29,0,16,0,19), nrow=2)
fisher.test(dat2, workspace=30000000)

dat3<-matrix(c(29,0,0,29,0,0,12,0,0,17,0,1,0,29,0,0,15,1,0,0,19),
nrow=3)
fisher.test(dat3, workspace=30000000)

Which both result in p-value < 2.2e-16

Kind regards, Søren

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche