Hello All... I gave a task to my students that involved using mclust to look for clusters in some bivariate data of isotopes vs various mining locations. They discovered something I didnĀ¹t expect; the data (called tur) is appended below.
p <- qplot(x = dD, y = dCu65, data = tur, color = mine) print(p) # simple bivariate plot of the data; looks fine mod1 <- Mclust(tur[,2:3]) mod1$G mod2 <- Mclust(tur[,3:2]) mod2$G One can use coordProj to see the clusters found, but the basic result is that mclust found 3 clusters in mod1, but if you interchange the x and y columns it finds 7 clusters (mod2). Since this is bivariate data, I find this result pretty strange. The only thing I can think of is that it has something to do with how the algorithm is seeded, but that isn't too convincing. Since I couldn't believe my eyes, I made up some bivariate data with known clusters (not given here) and interchanged the x and y columns and got the same clusters. I'm at a loss. Hopefully someone can set my understanding on the right path. Thanks for any insight! Bryan > sessionInfo() R version 2.10.1 (2009-12-14) x86_64-apple-darwin9.8.0 locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] datasets tools grid graphics grDevices utils stats methods base other attached packages: [1] faraway_1.0.4 GGally_0.1 xtable_1.5-6 mvbutils_2.5.1 ggplot2_0.8.7 [6] digest_0.4.2 reshape_0.8.3 proto_0.3-8 ChemoSpec_1.43 R.utils_1.4.0 [11] R.oo_1.7.1 R.methodsS3_1.2.0 rgl_0.91 lattice_0.18-3 mvoutlier_1.4 [16] plyr_0.1.9 RColorBrewer_1.0-2 chemometrics_0.8 som_0.3-5 robustbase_0.5-0-1 [21] rpart_3.1-46 pls_2.1-0 pcaPP_1.8 mvtnorm_0.9-9 nnet_7.3-1 [26] mclust_3.4.4 MASS_7.3-5 lars_0.9-7 e1071_1.5-23 class_7.3-2 And the data: > dput(tur) structure(list(mine = structure(c(13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 6L, 6L, 6L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 9L, 9L, 9L, 9L, 12L, 12L, 12L, 12L, 12L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 4L, 4L, 4L, 4L, 4L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("Cas", "CH", "CR", "FM", "GU", "KG", "KM", "LV", "MC", "MZ", "N8", "OR", "SB", "TF"), class = "factor"), dCu65 = c(6.1, 6.5, 5.7, 6.4, 6.1, 5.9, 6.5, 5.7, 6, 6.4, 6.8, 6.9, 6.7, 6.7, 5.8, 5.3, 6.7, 5.7, 6.7, 5.6, 6.4, 6.4, 6.5, 7.6, 6.7, 6.4, 6.4, 7, 7, 5.9, 5.3, 6.1, 11.8, 12.1, 12.4, 1, 1.8, 1.4, 1.2, 1.2, 1.2, 1.1, 1.2, 1.8, 1, 0.8, 0.7, 1.2, 1.5, 1, 0.2, 1.4, 0.3, 1.1, 0, -0.5, -1.4, -0.2, 1.1, 0.9, 0.1, 1.4, -1.2, -0.1, -0.4, -0.8, 3, 3.3, 3.9, 3.9, 3.7, 9.5, 9.6, 9.6, 8.3, 10.3, 8.4, 9.4, 0.8, 0.9, 1.3, 0.2, 2.2, 0.7, 1.8, 4.5, 4.5, 4.6, 4.5, 4.6, 3.9, 4.2, 4.9, 4.9, 4.5, 16.7, 17.4, 17.5, 17.5, 17.4, 1.9, 2.4, 1.7, 1.2, 1.5, 1.1, 1.8, 3.1, 3.1, 3, 4.1, 4.2, 2.7, 4.5, 0.4, 1.2, 3, 3.5, 1.2, 0.8, 0.9, 2, -0.8, 5.8, 0.6, 1, 0.9, 4.4, 2.3, -0.1), dD = c(-66L, -81L, -81L, -71L, -68L, -79L, -73L, -77L, -74L, -71L, -73L, -80L, -76L, -80L, -74L, -77L, -74L, -77L, -74L, -82L, -79L, -72L, -74L, -71L, -76L, -72L, -76L, -73L, -75L, -78L, -81L, -77L, -77L, -73L, -79L, -104L, -98L, -98L, -109L, -106L, -99L, -107L, -100L, -98L, -95L, -95L, -95L, -97L, -98L, -121L, -119L, -118L, -121L, -121L, -106L, -116L, -122L, -112L, -125L, -109L, -114L, -118L, -102L, -109L, -101L, -101L, -74L, -78L, -63L, -67L, -75L, -65L, -60L, -62L, -67L, -59L, -62L, -68L, -61L, -58L, -69L, -57L, -59L, -67L, -73L, -81L, -90L, -87L, -82L, -80L, -80L, -81L, -83L, -90L, -89L, -107L, -122L, -128L, -113L, -124L, -93L, -89L, -92L, -85L, -79L, -84L, -83L, -148L, -142L, -127L, -130L, -123L, -131L, -85L, -87L, -90L, -81L, -65L, -81L, -119L, -121L, -94L, -90L, -117L, -95L, -69L, -122L, -103L, -85L, -106L)), .Names = c("mine", "dCu65", "dD"), class = "data.frame", row.names = c(NA, -130L)) ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.