Please use dput()
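To illustrate the request: dput() writes an object out as R code, so a reader can paste it into a session and get the same object back, column types and factor levels included. Printed tables lose that information. A minimal sketch, assuming a small made-up data frame (`geo_small` is a hypothetical stand-in built from three of the rows quoted below, not the poster's actual object):

```r
# Hypothetical stand-in for a few rows of the poster's data.
geo_small <- data.frame(
  state     = factor(c("ME", "ME", "ME")),
  city      = factor(c("FAIRFIELD", "JONESPORT", "CASWELL")),
  latitude  = c(44.64485, 44.57935, 46.97529),
  longitude = c(-69.65948, -67.56743, -67.83023)
)

# dput() prints R code that rebuilds the object; paste that output into the email.
dput(geo_small)

# Round trip: capture the dput() text and evaluate it to rebuild the object.
txt <- capture.output(dput(geo_small))
rebuilt <- eval(parse(text = txt))

# Compare the rebuilt object with the original (types, levels, values).
all.equal(geo_small, rebuilt)
```

For a large object, something like dput(droplevels(head(geo1, 25))) keeps the post short; droplevels() discards factor levels that do not occur in the subset, which would otherwise all be written out.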
On Tue, May 12, 2020 at 7:11 PM Poling, William <poli...@aetna.com> wrote:

> Hello Eric, thank you so much for your consideration.
>
> Here are snippets of data that I hope will be helpful
>
> WHP
>
> geo1a <- geo1[, c(2:5)]  # eliminating ID, which is not useful for my purposes anyway
>
> #This is for R-Help use
> geo1a <- geo1a %>% top_n(25)
>
>    state           city latitude longitude
> 1     ME      FAIRFIELD 44.64485 -69.65948
> 2     ME      JONESPORT 44.57935 -67.56743
> 3     ME        CASWELL 46.97529 -67.83023
> 4     ME      ELLSWORTH 44.52916 -68.38717
> 5     ME     VASSALBORO 44.45095 -69.60629
> 6     ME          UNION 44.20059 -69.26123
> 7     ME        PALERMO 44.45142 -69.41115
> 8     ME          ORONO 44.87426 -68.68327
> 9     ME    SANGERVILLE 45.10138 -69.33580
> 10    ME      ISLESBORO 44.29015 -68.90812
> 11    ME        TOPSHAM 43.93600 -69.96565
> 12    ME       FREEPORT 43.84089 -70.11160
> 13    ME      SKOWHEGAN 44.76687 -69.71644
> 14    ME    MILLINOCKET 45.65501 -68.70261
> 15    ME      ORRINGTON 44.72417 -68.74026
> 16    ME     ST. GEORGE 43.96726 -69.20827
> 17    ME FORT FAIRFIELD 46.80911 -67.88079
> 18    ME      MARS HILL 46.56580 -67.89006
> 19    ME       FREEPORT 43.85302 -70.03726
> 20    ME         EASTON 46.64143 -67.91203
> 21    ME     WATERVILLE 44.53621 -69.65913
> 22    ME      BRUNSWICK 43.87771 -69.96297
> 23    ME      BRUNSWICK 43.91719 -69.89905
> 24    ME      BUCKSPORT 44.60665 -68.81892
> 25    ME        FAYETTE 44.46380 -70.12047
>
> trnd1_tbla <- trnd1_tbl %>% top_n(25)
> print(trnd1_tbla)
> head(trnd1_tbla,n=25)
>
> A tibble: 25 x 5
>    city      state Basecountsum Basecount2 prop_of_total
>    <fct>     <fct>        <dbl>      <dbl>         <dbl>
>  1 ATLANTA   GA            2352         12       0.00510
>  2 BRADENTON FL            2352          8       0.00340
>  3 BROOKLYN  NY            2352         30       0.0128
>  4 CHARLOTTE NC            2352          8       0.00340
>  5 CHICAGO   IL            2352         17       0.00723
>  6 COLUMBUS  OH            2352         11       0.00468
>  7 CUMMING   GA            2352          8       0.00340
>  8 DALLAS    TX            2352          8       0.00340
>  9 ERIE      PA            2352         12       0.00510
> 10 HOUSTON   TX            2352         12       0.00510
> # ... with 15 more rows
>
> WHP
>
> From: Eric Berger <ericjber...@gmail.com>
> Sent: Tuesday, May 12, 2020 8:39 AM
> To: Poling, William <poli...@aetna.com>
> Cc: r-help@r-project.org
> Subject: [EXTERNAL] Re: [R] Help with Kmeans output and using broom to tidy etc..
>
> Can you create a reproducible example?
> Your question involves objects that are unknown to us. (geo1, trnd1_tbl)
>
> On Tue, May 12, 2020 at 2:41 PM Poling, William via R-help <mailto:r-help@r-project.org> wrote:
> #RStudio Version Version 1.2.1335 need this one--> 1.2.5019
> sessionInfo()
> # R version 4.0.0 Patched (2020-05-03 r78349)
> #Platform: x86_64-w64-mingw32/x64 (64-bit)
> #Running under: Windows 10 x64 (build 17763)
>
> Hello:
>
> I have data that I am trying to manipulate for Kmeans clustering.
>
> Original data looks like this
>
> str(geo1)
> # 'data.frame': 2352 obs. of 5 variables:
> # $ ID: Factor w/ 2352 levels "101040199600",..: 590 908 976 509 1674 690 1336 86 726 1702 ...
> # $ state : Factor w/ 41 levels "AL","AR","AZ",..: 32 10 25 11 9 32 13 31 12 12 ...
> # $ city : Factor w/ 1337 levels "ABBOTTSTOWN",..: 932 156 230 698 965 1330 515 727 1127 1304 ...
> # $ latitude : num 40.4 31.2 40.8 42.1 26.8 ...
> # $ longitude : num -79.9 -81.5 -74 -91.6 -82.1 ...
>
> I created a subset adding column prop_of_total
> str(trnd1_tbl)
> tibble [1,457 x 5] (S3: tbl_df/tbl/data.frame)
> $ city : Factor w/ 1337 levels "ABBOTTSTOWN",..: 1 2 3 4 5 6 7 8 9 10 ...
> $ state : Factor w/ 41 levels "AL","AR","AZ",..: 32 36 10 28 12 36 10 11 26 38 ...
> $ Basecountsum : num [1:1457] 2352 2352 2352 2352 2352 ...
> $ Basecount2 : num [1:1457] 1 1 1 1 1 2 1 1 2 1 ...
> $ prop_of_total: num [1:1457] 0.000425 0.000425 0.000425 0.000425 0.000425 ...
>
> Then I spread it
>
> trnd2_tbl <- trnd1_tbl %>%
>   dplyr::select(city, state, prop_of_total) %>%
>   spread(key = city, value = prop_of_total, fill = 0) #remove the NA's with fill
>
> str(trnd2_tbl)#tibble [41 x 1,338] (S3: tbl_df/tbl/data.frame)
>
> Then I run a Kmeans
>
> kmeans_obj1 <- trnd2_tbl %>%
>   dplyr::select(- state) %>%
>   kmeans(centers = 20, nstart = 100)
>
> str(kmeans_obj1)
> List of 9
> $ cluster : int [1:41] 11 11 9 11 11 4 11 11 16 2 ...
> $ centers : num [1:20, 1:1337] 0 0 0 0 0 0 0 0 0 0 ...
> ..- attr(*, "dimnames")=List of 2
> .. ..$ : chr [1:20] "1" "2" "3" "4" ...
> .. ..$ : chr [1:1337] "ABBOTTSTOWN" "ABILENE" "ACWORTH" "ADAMS" ...
> $ totss : num 0.00158
> $ withinss : num [1:20] 0 0 0 0 0 0 0 0 0 0 ...
> $ tot.withinss: num 0.0000848
> $ betweenss : num 0.0015
> $ size : int [1:20] 1 1 1 1 1 1 1 1 1 1 ...
> $ iter : int 3
> $ ifault : int 0
> - attr(*, "class")= chr "kmeans"
>
> Then I go and try to tidy:
>
> #Tidy, glance, augment
> #Just makes it easier to use or view the obj's in the obj list
>
> broom::tidy(kmeans_obj1) %>% glimpse()
>
> broom::glance(kmeans_obj1)
> ##A tibble: 1 x 4
> #     totss tot.withinss betweenss  iter
> #     <dbl>        <dbl>     <dbl> <int>
> # 1 0.00158    0.0000848   0.00150     3
>
> However, when I run this piece I get an error:
>
> broom::augment(kmeans_obj1, trnd2_tbl) %>%
>   dplyr::select(city, .cluster)
>
> #Error: Must subset columns with a valid subscript vector.
> # The subscript has the wrong type `data.frame<
> #   u: double
> #   x: double
> >`.
> i It must be numeric or character.
>
> Here is the back trace:
>
> rlang::last_error()
>
> # Backtrace:
> # 1. broom::augment(kmeans_obj1, trnd2_tbl)
> # 9. dplyr::select(., city, .cluster)
> # 11. tidyselect::vars_select(tbl_vars(.data), !!!enquos(...))
> # 12. tidyselect:::eval_select_impl(...)
> # 20. tidyselect:::vars_select_eval(...)
> # 21. tidyselect:::walk_data_tree(expr, data_mask, context_mask)
> # 22. tidyselect:::eval_c(expr, data_mask, context_mask)
> # 23. tidyselect:::reduce_sels(node, data_mask, context_mask, init = init)
> # 24. tidyselect:::walk_data_tree(new, data_mask, context_mask)
> # 25. tidyselect:::as_indices_sel_impl(...)
> # 26. tidyselect:::as_indices_impl(x, vars, strict = strict)
> # 27. vctrs::vec_as_subscript(x, logical = "error")
>
> I am not sure what I am supposed to fix?
>
> Maybe someone has had similar error and can advise me please?
>
> Thank you.
>
> WHP
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.