Dear Tal,

I took the definitions of the Hubert gamma and Dunn indexes from the Gordon book. They are actually not about comparing two clusterings, at least not in that reference, and they require dissimilarities.

The adjusted Rand index and Meila's VI, as implemented in cluster.stats, compare two clusterings. If you set compareonly=TRUE in cluster.stats, it computes only these two indexes, so in principle it doesn't need the dissimilarity matrix. I will probably change it in the next update so that in this case you don't need to provide a dissimilarity matrix.

Until then, you can supply a noninformative matrix.
Example:
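library(fpc)
c1 <- sample(4,100,replace=TRUE)
c2 <- sample(5,100,replace=TRUE)
# d is a dummy all-zero matrix; the two indexes below do not depend on it
cs <- cluster.stats(d=matrix(0,ncol=100,nrow=100),c1,c2,compareonly=TRUE)

cs$corrected.rand  # adjusted Rand index
cs$vi              # Meila's VI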
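
Both indexes depend only on the cross-tabulation of the two label vectors, so they can also be computed without any matrix at all. Below is a minimal sketch in base R, not the fpc code: the names adj_rand and var_info are made up for illustration, and the use of natural logarithms in the VI is an assumption of this sketch.

```r
# Adjusted Rand index (Hubert & Arabie form) from two label vectors.
# adj_rand is a hypothetical helper, not part of fpc.
adj_rand <- function(c1, c2) {
  stopifnot(length(c1) == length(c2))
  tab <- table(c1, c2)                   # contingency table of the two clusterings
  pairs <- function(x) x * (x - 1) / 2   # number of unordered pairs among x objects
  s_ij <- sum(pairs(tab))                # pairs placed together in both clusterings
  s_i  <- sum(pairs(rowSums(tab)))       # pairs placed together in c1
  s_j  <- sum(pairs(colSums(tab)))       # pairs placed together in c2
  expected  <- s_i * s_j / pairs(sum(tab))
  max_index <- (s_i + s_j) / 2
  # NaN if both clusterings consist of a single cluster (0/0)
  (s_ij - expected) / (max_index - expected)
}

# Variation of information, VI = 2*H(C1,C2) - H(C1) - H(C2).
# var_info is a hypothetical helper; natural logs are assumed here.
var_info <- function(c1, c2) {
  stopifnot(length(c1) == length(c2))
  p <- table(c1, c2) / length(c1)        # joint distribution of the labels
  entropy <- function(q) {
    q <- q[q > 0]                        # treat 0 * log(0) as 0
    -sum(q * log(q))
  }
  2 * entropy(p) - entropy(rowSums(p)) - entropy(colSums(p))
}

set.seed(1)
c1 <- sample(4, 100, replace = TRUE)
c2 <- sample(5, 100, replace = TRUE)
adj_rand(c1, c2)   # close to 0 for independent random labels
var_info(c1, c2)
adj_rand(c1, c1)   # 1 for identical clusterings
var_info(c1, c1)   # 0 for identical clusterings
```

Neither function looks at the label values themselves, only at which objects share a label, so relabelling the clusters leaves both results unchanged.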

Hope this helps,
Christian



On Wed, 21 Apr 2010, Tal Galili wrote:

Thanks for the fast reply Uwe.

My hope in posting this was to find if anyone had already done work (in R)
in this direction.  So far I wasn't able to find any such relevant code, so
I turned to the mailing list.

Regarding new implementations - thanks for offering! - I have already come
across one such algorithm. I implemented it, and will probably publish it
on my blog <http://www.r-statistics.com/> in the near future.

If anyone else has a reference to an R implementation, it would be most
helpful,
Tal


----------------Contact Details:-------------------------------------------------------
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
----------------------------------------------------------------------------------------------




2010/4/21 Uwe Ligges <lig...@statistik.tu-dortmund.de>

On 21.04.2010 18:15, Tal Galili wrote:

Hello all,

I would like to compare the similarity of two cluster solutions using a
validation criterion (such as Hubert's gamma coefficient, the Dunn index,
the corrected Rand index and so on).

I see (from here: http://www.statmethods.net/advstats/cluster.html) that
the function cluster.stats() in the fpc package provides a mechanism
for comparing 2 cluster solutions - *BUT* - it requires me to give the
distance matrix among objects.

*My question* is: What ways can you suggest for comparing two cluster
solutions while using the cluster indicators only (i.e., a vector saying
to which cluster each object belongs), WITHOUT having to submit the
distance matrix between the objects?


Don't know. If you have a theoretical solution and can provide the
description of a method, there will be many people around happy to make an
algorithm and implement it.

Uwe Ligges



Thanks,
Tal




       [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
