>>>>> Gabor Grothendieck <ggrothendi...@gmail.com>
>>>>>     on Fri, 2 Jul 2010 18:50:28 -0400 writes:

> In kmeans() in stats one gets an error message with the default
> clustering algorithm if centers = 1.  It's often useful to calculate
> the sum of squares for 1 cluster, 2 clusters, etc. and this error
> complicates things since one has to treat 1 cluster as a special case.
> A second reason is that easily getting the 1 cluster sum of squares
> makes it easy to calculate the between cluster sum of squares when
> there is more than 1 cluster.

> I suggest adding the line marked ### to the source code of kmeans (the
> other lines shown are just there to illustrate context).  Adding this
> line forces kmeans to use the code for algorithm 3 if centers is 1.
> This is useful since, unlike the code for the default algorithm, the
> code for algorithm 3 succeeds for centers = 1.

>     if(length(centers) == 1) {
>         if (centers == 1) nmeth <- 3 ###
>         k <- centers

I agree that this is a reasonable improvement, and have applied
this (+ docu + example) to the R-devel sources.

Thank you, Gabor.

> Also note that KMeans in Rcmdr produces a betweenss and a tot.withinss
> and it would be nice if kmeans in stats did that too:

Well, patches (to the R-devel *sources*) are happily accepted.

Martin

>> library(Rcmdr)
>> str(KMeans(USArrests, 3))
> List of 6
>  $ cluster     : Named int [1:50] 1 1 1 2 1 2 3 1 1 2 ...
>   ..- attr(*, "names")= chr [1:50] "Alabama" "Alaska" "Arizona" "Arkansas" ...
>  $ centers     : num [1:3, 1:4] 11.81 8.21 4.27 272.56 173.29 ...
>   ..- attr(*, "dimnames")=List of 2
>   .. ..$ : chr [1:3] "1" "2" "3"
>   .. ..$ : chr [1:4] "Murder" "Assault" "UrbanPop" "Rape"
>  $ withinss    : num [1:3] 19564 9137 19264
>  $ size        : int [1:3] 16 14 20
>  $ tot.withinss: num 47964    <=================
>  $ betweenss   : num 307844   <=================
>  - attr(*, "class")= chr "kmeans"

> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
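
For reference, the sum-of-squares bookkeeping discussed in this thread can be
sketched in a few lines of R.  This is only an illustration under stated
assumptions, not the code that went into R-devel: the variable names (totss,
betweenss.direct, grand) are made up here, and the one-cluster sum of squares
is computed directly from the column means rather than via
kmeans(x, centers = 1).

```r
# Sketch: between-cluster sum of squares from the one-cluster total.
# All variable names below are illustrative, not part of any kmeans() patch.
x <- as.matrix(USArrests)

# One-cluster (total) sum of squares: squared deviations of every column
# from its column mean, summed over all cells.
totss <- sum(scale(x, center = TRUE, scale = FALSE)^2)

set.seed(1)                        # kmeans() starts from random centers
fit <- kmeans(x, centers = 3)

tot.withinss <- sum(fit$withinss)  # pooled within-cluster sum of squares
betweenss    <- totss - tot.withinss

# Cross-check against the direct definition: cluster size times the squared
# distance of each cluster center from the grand mean.
grand <- colMeans(x)
betweenss.direct <- sum(fit$size * rowSums(sweep(fit$centers, 2, grand)^2))
stopifnot(isTRUE(all.equal(betweenss, betweenss.direct)))
```

The two quantities marked in the str() output quoted above follow the same
identity: 47964 + 307844 = 355808, the total sum of squares for USArrests.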