this can be dome even faster, and I think more easily read, using only base R
d1 <- data.frame(workshop=rep(1:2,4), gender=rep(c("f","m"),each=4)) ## needed by vector and rowbased, not needed by rmh library(tibble) library(plyr) library(magrittr) microbenchmark( vector = {d1 %>% rownames_to_column("ID") %>% mutate( test1 = ifelse(gender == "f" & workshop == 1, 7, 0), test2 = ifelse(gender == "f" & workshop == 1, test1 + 2, 0), test4 = ifelse(gender == "f" & workshop == 1, 1, 0), test5 = test4 ) }, rowbased = {d1 %>% rownames_to_column("ID") %>% mutate(test1 = NA, test2 = NA, test4 = NA, test5 = NA) %>% ddply("ID", within, if (gender == "f" & workshop == 1) { test1 <- 1 test1 <- 6 + test1 test2 <- 2 + test1 test4 <- 1 test5 <- 1 } else { test1 <- test2 <- test4 <- test5 <- 0 })}, rmh={ data.frame(ID=rownames(d1), d1, test1=0, test2=0, test4=0, test5=0) myRowSubset <- d3$gender=="f" & d3$workshop==1 test1 <- 1 d3[myRowSubset, "test1"] <- test1 + 6 d3[myRowSubset, "test2"] <- test1 + 6 + 2 d3[myRowSubset, c("test4", "test5")] <- test1 } ) Unit: microseconds expr min lq mean median uq max neval cld vector 1281.994 1468.102 1669.266 1573.043 1750.354 3171.777 100 a rowbased 8131.230 8691.899 10894.700 9219.882 10435.642 133293.034 100 b rmh 925.571 1056.530 1167.568 1116.425 1221.457 1968.199 100 a On Mon, Dec 17, 2018 at 12:15 PM Thierry Onkelinx via R-help <r-help@r-project.org> wrote: > > Dear Paul, > > R's power is that is works vectorised. Unlike SAS which is rowbased. Using > R in a SAS way will lead to very slow code. > > Your examples can be written vectorised > > d1 %>% > rownames_to_column("ID") %>% > mutate( > test1 = ifelse(gender == "f" & workshop == 1, 7, 0), > test2 = ifelse(gender == "f" & workshop == 1, test1 + 2, 0), > test4 = ifelse(gender == "f" & workshop == 1, 1, 0), > test5 = test4 > ) > > Here is a speed comparison. > > library(microbenchmark) > microbenchmark( > vector = {d1 %>% > rownames_to_column("ID") %>% > mutate( > test1 = ifelse(gender == "f" & workshop == 1, 7, 0), > test2 = ifelse(gender == "f" & workshop == 1, test1 + 2, 0), > test4 = ifelse(gender == "f" & workshop == 1, 1, 0), > test5 = test4 > ) }, > rowbased = {d1 %>% > rownames_to_column("ID") %>% > mutate(test1 = NA, test2 = NA, test4 = NA, test5 = NA) %>% > ddply("ID", > within, > if (gender == "f" & workshop == 1) { > test1 <- 1 > test1 <- 6 + test1 > test2 <- 2 + test1 > test4 <- 1 > test5 <- 1 > } else { > test1 <- test2 <- test4 <- test5 <- 0 > })} > ) > > > Best regards, > > Thierry > > ir. Thierry Onkelinx > Statisticus / Statistician > > Vlaamse Overheid / Government of Flanders > INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND > FOREST > Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance > thierry.onkel...@inbo.be > Havenlaan 88 bus 73, 1000 Brussel > www.inbo.be > > /////////////////////////////////////////////////////////////////////////////////////////// > To call in the statistician after the experiment is done may be no more > than asking him to perform a post-mortem examination: he may be able to say > what the experiment died of. ~ Sir Ronald Aylmer Fisher > The plural of anecdote is not data. ~ Roger Brinner > The combination of some data and an aching desire for an answer does not > ensure that a reasonable answer can be extracted from a given body of data. > ~ John Tukey > /////////////////////////////////////////////////////////////////////////////////////////// > > <https://www.inbo.be> > > > Op ma 17 dec. 2018 om 16:30 schreef Paul Miller via R-help < > r-help@r-project.org>: > > > Hello All, > > > > Season's greetings! > > > > Am trying to replicate some SAS code in R. The SAS code uses if-then-do > > code blocks. I've been trying to do likewise in R as that seems to be the > > most reliable way to get the same result. > > > > Below is some toy data and some code that does work. There are some things > > I don't necessarily like about the code though. So I was hoping some people > > could help make it better. One thing I don't like is that the within > > function reverses the order of the computed columns such that test1:test5 > > becomes test5:test1. I've used a mutate to overcome that but would prefer > > not to have to do so. > > > > Another, perhaps very small thing, is the need to calculate an ID > > variable that becomes the basis for a grouping. > > > > I did considerable Internet searching for R code that conditionally > > computes blocks of code. I didn't find much though and so am wondering if > > my search terms were not sufficient or if there is some other reason. It > > occurred to me that maybe if-then-do code blocks like we often see in SAS > > as are frowned upon and therefore not much implemented. > > > > I'd be interested in seeing more R-compatible approaches if this is the > > case. I've learned that it's a mistake to try and make R be like SAS. It's > > better to let R be R. Trouble is I'm not always sure how to do that. > > > > Thanks, > > > > Paul > > > > > > d1 <- data.frame(workshop=rep(1:2,4), > > gender=rep(c("f","m"),each=4)) > > > > library(tibble) > > library(plyr) > > > > d2 <- d1 %>% > > rownames_to_column("ID") %>% > > mutate(test1 = NA, test2 = NA, test4 = NA, test5 = NA) %>% > > ddply("ID", > > within, > > if (gender == "f" & workshop == 1) { > > test1 <- 1 > > test1 <- 6 + test1 > > test2 <- 2 + test1 > > test4 <- 1 > > test5 <- 1 > > } else { > > test1 <- test2 <- test4 <- test5 <- 0 > > }) > > > > ______________________________________________ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.