Hello All,

I would like to summarise across columns in dplyr using a vector of variable 
names and a condition. Below is an example "have" data set followed by an 
example "need" data set. After that comes a vector of numeric variable names, 
and then the very humble beginnings of a dplyr-based solution.

What I think I need to do is pass my variable names to dplyr and then apply a 
conditional function. If the variable is in my list of names, calculate the 
mean and the standard deviation. If not, calculate the mean but label it as a 
proportion. The question is how to do that. It appears that using variable 
names might involve !!, or possibly enquo, or possibly quo, but I haven't had 
much success with these. I imagine I might have been very close without quite 
getting it. The conditional part seems less difficult, but I'm not quite sure 
how to do that either.

Help with this would be greatly appreciated.

Thanks,

Paul


have <- structure(list(
        ptno = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M",
                 "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"),
        age1 = c(74, 70, 78, 79, 72, 81, 76, 58, 53, 74, 72, 74, 75,
                 73, 80, 62, 67, 65, 83, 67, 72, 90, 73, 84, 90, 51),
        age2 = c(71, 67, 72, 74, 65, 79, 70, 49, 45, 68, 70, 71, 74,
                 71, 69, 58, 65, 59, 80, 60, 68, 87, 71, 82, 80, 49),
        gender_male = c(1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 0L,
                        1L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 0L),
        gender_female = c(0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L,
                          0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L),
        race_white = c(0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 0L,
                       1L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
        race_black = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
                       0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
        race_other = c(1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L,
                       0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)),
        row.names = c(NA, -26L), class = c("tbl_df", "tbl", "data.frame"))
 

need <- structure(list(
       age1_mean = 72.8076923076923, age1_std = 9.72838827666425,
       age2_mean = 68.2307692307692, age2_std = 10.2227498934785,
       gender_male_prop = 0.576923076923077,
       gender_female_prop = 0.423076923076923,
       race_white_prop = 0.769230769230769,
       race_black_prop = 0.0384615384615385,
       race_other_prop = 0.192307692307692),
       row.names = c(NA, -1L), class = c("tbl_df", "tbl", "data.frame"))

vars_num <-  c("age1", "age2")

library(magrittr)
library(dplyr)

have %>%
  summarise(across(
    .cols = !contains("ptno"),
    .fns = list(mean = mean, std = sd),
    .names = "{col}_{fn}"
  ))
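
A minimal sketch of one possible approach, offered as an illustration rather 
than a definitive answer. A character vector of names can be passed to across() 
through tidyselect's all_of(), so !!, enquo and quo may not be needed here. The 
condition can then be written as two across() calls, one for the columns named 
in vars_num and one for the rest. The `demo` tibble is a small made-up stand-in 
for `have` so that the block runs on its own, and `vars_prop` is a helper name 
introduced only for this example.

```r
library(dplyr)

# Made-up stand-in for `have` (hypothetical values), so the sketch is self-contained
demo <- tibble(
  ptno        = c("A", "B", "C", "D"),
  age1        = c(74, 70, 78, 79),
  gender_male = c(1L, 1L, 0L, 1L)
)

# Names of the numeric variables, as a character vector
vars_num <- c("age1")

# Everything that is neither the ID column nor a numeric variable
# (`vars_prop` is an illustrative helper, not part of the original post)
vars_prop <- setdiff(names(demo), c("ptno", vars_num))

result <- demo %>%
  summarise(
    # Variables listed in vars_num: mean and standard deviation
    across(all_of(vars_num),
           list(mean = mean, std = sd),
           .names = "{.col}_{.fn}"),
    # Remaining 0/1 indicator variables: the mean, labelled as a proportion
    across(all_of(vars_prop),
           list(prop = mean),
           .names = "{.col}_{.fn}")
  )

result
```

Building `vars_prop` as an explicit vector of names keeps the second across() 
from picking up the summary columns created by the first one. Swapping `demo` 
for `have` and setting vars_num <- c("age1", "age2") should give columns named 
as in `need`.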

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
