I don't know how the hclust function is implemented, but in hierarchical
clustering generally the result can be ambiguous if several distances in
the dataset have identical values (or if identical between-cluster
distances arise when clusters are aggregated). Whether the order of the
data matters depends on how these ambiguities are resolved. It may well be
that if, at some point in building the hierarchy, there are two different
ways to merge clusters at the same distance value, what hclust does is
determined by the order of the data.
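
To illustrate the point about ties, here is a toy sketch in Python. It is
not hclust's code and makes no claim about how hclust resolves ties. The
function name complete_linkage, the three labelled points, and the rule
"take the first minimal pair in scan order" are all assumptions made for
this example. It only shows that, under such a rule, reordering the same
data can change which clusters are merged first.

```python
from itertools import combinations


def complete_linkage(points):
    """Naive agglomerative clustering with complete linkage on 1-D points.

    points: list of (label, value) pairs, in the order they appear in the data.
    Ties are broken by taking the first minimal pair in scan order. This rule
    is an assumption of the sketch, not a statement about hclust.
    Returns the merges as a list of (cluster_a, cluster_b, height).
    """
    value = dict(points)
    clusters = [frozenset([label]) for label, _ in points]

    def dist(a, b):
        # Complete linkage: largest pairwise distance between the two clusters.
        return max(abs(value[x] - value[y]) for x in a for y in b)

    merges = []
    while len(clusters) > 1:
        # min() returns the first minimal pair, so the input order decides ties.
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# Three equally spaced points: d(a,b) = d(b,c) = 1, so the first merge is tied.
data = [("a", 0.0), ("b", 1.0), ("c", 2.0)]
forward = complete_linkage(data)
backward = complete_linkage(list(reversed(data)))

print(sorted(forward[0][0] | forward[0][1]))    # ['a', 'b']
print(sorted(backward[0][0] | backward[0][1]))  # ['b', 'c']
```

Both runs merge at the same heights (1 and then 2), but the first merge
joins a and b in one ordering and b and c in the other, so cutting the tree
into two clusters gives different groupings. With real data that contains
many tied distances, differences of this kind can accumulate.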
Hope this helps,
Christian
On Mon, 15 Nov 2010, rchowdhury wrote:
Hello,
I am using the hclust function to cluster some data. I have two separate
files with the same data. The only difference is the order of the data in
the file. For some reason, when I run the two files through the hclust
function, I get two completely different results.
Does anyone know why this is happening? Does the order of the data matter?
Thanks,
RC
--
View this message in context:
http://r.789695.n4.nabble.com/hclust-does-order-of-data-matter-tp3043896p3043896.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche