Please see ?kmeans and note the "cluster" component of the returned value, which appears to provide the info you seek: it is an integer vector giving the cluster to which each row of the input was allocated.
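As a minimal sketch of how the "cluster" component could be used to pull each cluster's rows into its own data frame: the data below are made-up stand-ins for rr0 (only the two column names come from the post), and the names rr0_labelled and by_cluster are purely illustrative.

```r
# Hypothetical stand-in for rr0: two numeric columns with invented values
set.seed(213)
rr0 <- data.frame(
  SavingsReversed = round(runif(200, 0, 1200), 2),
  ProviderID      = round(runif(200, 113676, 5560636))
)

rr0a <- kmeans(rr0, centers = 10)

# rr0a$cluster has one entry per row of rr0, in the same order,
# so it can be attached to the data as an extra column
rr0_labelled <- cbind(rr0, cluster = rr0a$cluster)

# split() returns a named list holding one data frame per cluster
by_cluster <- split(rr0_labelled, rr0_labelled$cluster)

# Rows allocated to cluster 1
head(by_cluster[["1"]])

# Row counts per cluster; these should agree with rr0a$size
sapply(by_cluster, nrow)
```

Keeping the pieces in a list rather than in ten separate objects makes it easy to loop over them with lapply(); a single cluster can also be extracted directly with rr0[rr0a$cluster == 1, ].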
-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Sat, Dec 8, 2018 at 7:03 AM Bill Poling <bill.pol...@zelis.com> wrote:

> Good afternoon. I hope I have provided enough info to get my question
> answered.
>
> I am running windows 10 -- R3.5.1 -- RStudio Version 1.1.456
>
> When running a K-Means clustering routine is it possible to get the actual
> data from each cluster into a DF?
>
> I have reviewed a number of tutorials and unless I missed it somewhere I
> would like to know if it is possible.
>
> https://www.datacamp.com/community/tutorials/k-means-clustering-r
> https://www.guru99.com/r-k-means-clustering.html
> https://datascienceplus.com/k-means-clustering-in-r/
> https://datascienceplus.com/finding-optimal-number-of-clusters/
> http://enhancedatascience.com/2017/10/24/machine-learning-explained-kmeans/
> http://enhancedatascience.com/2017/04/30/r-basics-k-means-r/
>
> For example:
>
> I ran the below and get K-means clustering with 10 clusters of sizes 1511,
> 1610, 702, 926, 996, 1076, 580, 2429, 728, 3797
> Can the 1511 values of SavingsReversed and ProviderID, 1610 values of
> SavingsReversed and ProviderID, etc.. be run out into DF's?
>
> Thank you for your help.
>
> WHP
>
> str(rr0)
> Classes 'data.table' and 'data.frame': 14355 obs. of 2 variables:
>  $ SavingsReversed: num 0 0 61 128 160 ...
>  $ ProviderID     : num 113676 113676 116494 116641 116641 ...
>  - attr(*, ".internal.selfref")=<externalptr>
>
> head(rr0, n=35)
>     SavingsReversed ProviderID
>  1:            0.00     113676
>  2:            0.00     113676
>  3:           61.00     116494
>  4:          128.25     116641
>  5:          159.60     116641
>  6:          372.66     119316
>  7:           18.79     121319
>  8:           15.64     121319
>  9:            0.00     121319
> 10:           18.79     121319
> 11:           23.00     121319
> 12:           18.79     121319
> 13:            0.00     121319
> 14:           25.86     121319
> 15:           14.00     121319
> 16:          113.00     121545
> 17:           50.00     121545
> 18:         1155.32     121545
> 19:          113.00     121545
> 20:          197.20     121545
> 21:            0.00     121780
> 22:           36.00     122536
> 23:         1171.32     125198
> 24:         1171.32     125198
> 25:           43.00     125303
> 26:            0.00     125881
> 27:           69.64     128435
> 28:          420.18     128435
> 29:          175.18     128435
> 30:           71.54     128435
> 31:           99.85     128435
> 32:            0.00     128435
> 33:           42.75     128435
> 34:          175.18     128435
> 35:          846.45     128435
>
> set.seed(213)
> rr0a <- kmeans(rr0, 10)
> View(rr0a)
> summary(rr0a)
> #              Length Class  Mode
> # cluster      14355  -none- numeric
> # centers         20  -none- numeric
> # totss            1  -none- numeric
> # withinss        10  -none- numeric
> # tot.withinss     1  -none- numeric
> # betweenss        1  -none- numeric
> # size            10  -none- numeric
> # iter             1  -none- numeric
> # ifault           1  -none- numeric
>
> x1 <- as.data.frame(rr0a$centers)
> sort(x1)
> #    SavingsReversed ProviderID
> # 2         75.19665  2773789.2
> # 3         99.31959  4147091.6
> # 5        101.21070  3558532.7
> # 4        103.41147  3893274.4
> # 1        105.38310  2241031.2
> # 8        114.61562  3240701.5
> # 10       121.14184  4718727.6
> # 9        153.70536  4470878.9
> # 6        156.84426  5560636.6
> # 7        185.09745   173732.9
> print(rr0a)
> # K-means clustering with 10 clusters of sizes 1511, 1610, 702, 926, 996, 1076, 580, 2429, 728, 3797
> #
> # Cluster means:
> #    SavingsReversed ProviderID
> # 1        105.38310  2241031.2
> # 2         75.19665  2773789.2
> # 3         99.31959  4147091.6
> # 4        103.41147  3893274.4
> # 5        101.21070  3558532.7
> # 6        156.84426  5560636.6
> # 7        185.09745   173732.9
> # 8        114.61562  3240701.5
> # 9        153.70536  4470878.9
> # 10       121.14184  4718727.6
> # Within cluster sum of squares by cluster:
> #  [1] 74529288379846 25846368411171 4692898666512 6277704963344 8428785199973 90824041558798 1468798013919 12143462193009 5483877005233
> # [10] 51547955737867
> # (between_SS / total_SS = 98.7 %)
> #
> # Available components:
> #
> # [1] "cluster" "centers" "totss" "withinss" "tot.withinss" "betweenss" "size" "iter" "ifault"
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.