Re: [R] Clustering of datasets

Rui Barradas Mon, 05 Sep 2022 06:03:10 -0700

Hello,

I am not at all sure that the following answers the question.

The code below tries to find the optimal number of clusters. One of the changes I have made to your call to kmeans is to subset DMs without dropping the dim attribute (drop = FALSE).
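To illustrate what "not dropping the dim attribute" means, here is a minimal sketch on a toy data frame. The object `toy` and its values are made up for illustration and are not the DMs data from the post.

```r
# Toy data frame (hypothetical values, not the DMs data)
toy <- data.frame(Country = c("A", "B", "C"),
                  DATA = c(-0.01, -0.02, -0.30))

v <- toy[, 2]                 # plain numeric vector, dim attribute is dropped
m <- toy[, 2, drop = FALSE]   # one-column data frame, dim attribute is kept

is.null(dim(v))   # TRUE: a vector has no dimensions
dim(m)            # 3 rows, 1 column
```

Functions that expect a matrix or data frame can then be given `m` directly, while `v` would first have to be converted.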



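library(cluster)  # kept from the original post; kmeans() itself is in stats

max_clust <- 10
wss <- numeric(max_clust)

# total within-cluster sum of squares for k = 1, ..., max_clust
for(k in seq_len(max_clust)) {
  km <- kmeans(DMs[,2], centers = k, nstart = 25)
  wss[k] <- km$tot.withinss
}
# elbow plot: look for the k where the curve flattens
plot(wss, type = "b")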

dm <- DMs[, 2, drop = FALSE]
# Where is the elbow, at 2 or at 4?
factoextra::fviz_nbclust(dm, kmeans, method = "wss")
factoextra::fviz_nbclust(dm, kmeans, method = "silhouette")

k2 <- kmeans(dm, centers = 2, nstart = 25)
k3 <- kmeans(dm, centers = 3, nstart = 25)
k4 <- kmeans(dm, centers = 4, nstart = 25)

# centers is a k x 1 matrix here; nrow() gives k for any number of columns
main2 <- paste(nrow(k2$centers), "clusters")
main3 <- paste(nrow(k3$centers), "clusters")
main4 <- paste(nrow(k4$centers), "clusters")

old_par <- par(mfcol = c(1, 3))
plot(DMs[,2], col = k2$cluster, pch = 19, main = main2)
plot(DMs[,2], col = k3$cluster, pch = 19, main = main3)
plot(DMs[,2], col = k4$cluster, pch = 19, main = main4)
par(old_par)
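As a numeric complement to the plots, the average silhouette width for each k can also be computed directly with cluster::silhouette. This is only a sketch under stated assumptions: the data `x` below are simulated toy values, not the DMs data, and the range of k (2 to 6) is an arbitrary choice.

```r
library(cluster)

set.seed(1)  # kmeans uses random starts; fix the seed for reproducibility

# Toy one-column data (hypothetical): a large group near -0.02, a small one near -0.10
x <- matrix(c(rnorm(20, mean = -0.02, sd = 0.01),
              rnorm(5,  mean = -0.10, sd = 0.01)), ncol = 1)
d <- dist(x)

ks <- 2:6  # silhouette width is not defined for a single cluster
avg_sil <- sapply(ks, function(k) {
  km <- kmeans(x, centers = k, nstart = 25)
  # mean silhouette width over all observations for this k
  mean(silhouette(km$cluster, d)[, "sil_width"])
})

best_k <- ks[which.max(avg_sil)]
data.frame(k = ks, avg_sil = avg_sil)
```

The k with the largest average silhouette width is one candidate for the number of clusters; it need not agree with the elbow in the wss plot.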


Hope this helps,

Rui Barradas


At 12:31 on 05/09/2022, Subhamitra Patra wrote:

Dear all,

I am trying to cluster my data using the k-means clustering technique in
R, but I am getting somewhat scattered results. I have pasted my code
below. Please suggest where my code is lacking. The data are read in
before applying the k-means method, as follows.

DMs<-read.table(text="Country DATA
                       IS -0.0092
                       BA -0.0235
                       HK -0.0239
                       JA -0.0333
                       KU -0.0022
                       OM -0.0963
                       QA -0.0706
                       SK -0.0322
                       SA -0.1233
                       SI -0.0141
                       TA -0.0142
                       UAE -0.0656
                       AUS -0.0230
                      BEL -0.0006
                      CYP -0.0085
                      CR  -0.0398
                     DEN  -0.0423
                       EST -0.0604
                       FIN -0.0227
                       FRA -0.0085
                      GER -0.0272
                      GrE -0.3519
                      ICE -0.0210
                      IRE -0.0057
                      LAT -0.0595
                     LITH -0.0451
                     LUXE -0.0023
                     MAL  -0.0351
                     NETH -0.0048
                       NOR -0.0495
                       POL -0.0081
                     PORT -0.0044
                     SLOVA -0.1210
                     SLOVE -0.0031
                       SPA -0.0213
                       SWE -0.0106
                     SWIT -0.0152
                       UK -0.0030
                     HUNG -0.0086
                       CAN -0.0144
                     CHIL -0.0078
                       USA -0.0042
                     BERM -0.0035
                     AUST -0.0211
                     NEWZ -0.0538" ,
                  header = TRUE,stringsAsFactors=FALSE)
library(cluster)
k1<-kmeans(DMs[,2],centers=2,nstart=25)
plot(DMs[,2],col=k1$cluster,pch=19,xlim=c(1,46), ylim=c(-0.12,0))
text(1:46,DMs[,2],DMs[,1],col=k1$cluster)
legend(10,1,c("cluster 1: Highly Integrated","cluster 2: Less Integrated"),
col=1:2,pch=19)


