Hello,
I am not at all sure that the following answers the question.
The code below tries to find the optimal number of clusters. One of the
changes I have made to your call to kmeans is to subset DMs without
dropping the dim attribute (drop = FALSE).
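As a small illustration of what "not dropping the dim attribute" means, here is a sketch on a hypothetical stand-in data frame (DMs_demo, built from four rows of the posted data). As far as I understand it, kmeans itself accepts a plain vector, but helpers such as factoextra::fviz_nbclust expect a matrix or data frame, which is why the one-column form is the safer one to pass around.

```r
# DMs_demo is a hypothetical stand-in for DMs (four rows of the posted data)
DMs_demo <- data.frame(Country = c("IS", "BA", "HK", "JA"),
                       DATA = c(-0.0092, -0.0235, -0.0239, -0.0333))

v <- DMs_demo[, 2]                # plain numeric vector, dim attribute dropped
d <- DMs_demo[, 2, drop = FALSE]  # one-column data.frame, dim attribute kept

is.null(dim(v))  # TRUE: a vector has no dim
dim(d)           # 4 rows, 1 column
```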
library(cluster)
max_clust <- 10
wss <- numeric(max_clust)
for(k in 1:max_clust) {
  # keep the one-column structure, as described above
  km <- kmeans(DMs[, 2, drop = FALSE], centers = k, nstart = 25)
  wss[k] <- km$tot.withinss
}
plot(wss, type = "b")
dm <- DMs[, 2, drop = FALSE]
# Where is the elbow, at 2 or at 4?
factoextra::fviz_nbclust(dm, kmeans, method = "wss")
factoextra::fviz_nbclust(dm, kmeans, method = "silhouette")
k2 <- kmeans(dm, centers = 2, nstart = 25)
k3 <- kmeans(dm, centers = 3, nstart = 25)
k4 <- kmeans(dm, centers = 4, nstart = 25)
# nrow() counts the centers even if the data had more than one column
main2 <- paste(nrow(k2$centers), "clusters")
main3 <- paste(nrow(k3$centers), "clusters")
main4 <- paste(nrow(k4$centers), "clusters")
old_par <- par(mfcol = c(1, 3))
plot(DMs[,2], col = k2$cluster, pch = 19, main = main2)
plot(DMs[,2], col = k3$cluster, pch = 19, main = main3)
plot(DMs[,2], col = k4$cluster, pch = 19, main = main4)
par(old_par)
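If you want the numbers behind the silhouette plot rather than only the picture, something along these lines might help. It is only a sketch: x is a subset of the posted DATA column chosen to keep the example self-contained, and dm_demo, ks and avg_sil are names made up for the illustration. It uses cluster::silhouette to compute the average silhouette width for each candidate k. Larger values suggest better-separated clusters.

```r
library(cluster)

# Subset of the posted DATA column, used only to keep the sketch self-contained
x <- c(-0.0092, -0.0235, -0.0239, -0.0333, -0.0022, -0.0963, -0.0706,
       -0.0322, -0.1233, -0.0141, -0.0142, -0.0656, -0.3519, -0.1210)
dm_demo <- matrix(x, ncol = 1)  # one-column matrix, dim attribute kept
d <- dist(dm_demo)              # pairwise distances needed by silhouette()

set.seed(2022)                  # kmeans uses random starts
ks <- 2:6                       # silhouette is not defined for k = 1
avg_sil <- sapply(ks, function(k) {
  km <- kmeans(dm_demo, centers = k, nstart = 25)
  sil <- silhouette(km$cluster, d)
  mean(sil[, "sil_width"])      # average silhouette width for this k
})
names(avg_sil) <- ks

avg_sil                   # one average width per candidate k
ks[which.max(avg_sil)]    # k with the largest average width
```

With the full data you would replace dm_demo by DMs[, 2, drop = FALSE]. With one extreme value such as GrE in the data, do not be surprised if the silhouette criterion favours a small k that mostly isolates that point.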
Hope this helps,
Rui Barradas
At 12:31 on 05/09/2022, Subhamitra Patra wrote:
Dear all,
I am trying to cluster my dataset using the k-means clustering technique in
R, but I am getting rather scattered results. I have pasted my code
below. Please suggest what is wrong with it. The data, as read in
before applying the k-means method, are as follows.
DMs<-read.table(text="Country DATA
IS -0.0092
BA -0.0235
HK -0.0239
JA -0.0333
KU -0.0022
OM -0.0963
QA -0.0706
SK -0.0322
SA -0.1233
SI -0.0141
TA -0.0142
UAE -0.0656
AUS -0.0230
BEL -0.0006
CYP -0.0085
CR -0.0398
DEN -0.0423
EST -0.0604
FIN -0.0227
FRA -0.0085
GER -0.0272
GrE -0.3519
ICE -0.0210
IRE -0.0057
LAT -0.0595
LITH -0.0451
LUXE -0.0023
MAL -0.0351
NETH -0.0048
NOR -0.0495
POL -0.0081
PORT -0.0044
SLOVA -0.1210
SLOVE -0.0031
SPA -0.0213
SWE -0.0106
SWIT -0.0152
UK -0.0030
HUNG -0.0086
CAN -0.0144
CHIL -0.0078
USA -0.0042
BERM -0.0035
AUST -0.0211
NEWZ -0.0538" ,
header = TRUE,stringsAsFactors=FALSE)
library(cluster)
k1<-kmeans(DMs[,2],centers=2,nstart=25)
plot(DMs[,2],col=k1$cluster,pch=19,xlim=c(1,46), ylim=c(-0.12,0))
text(1:46,DMs[,2],DMs[,1],col=k1$cluster)
legend(10,1,c("cluster 1: Highly Integrated","cluster 2: Less Integrated"),
col=1:2,pch=19)