> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Kevin E. Thorpe
> Sent: Tuesday, March 31, 2015 10:53 AM
> To: Duncan Murdoch
> Cc: R Help Mailing List
> Subject: Re: [R] Randomly interleaving data frames while preserving order
>
> On 03/31/2015 01:44 PM, Duncan Murdoch wrote:
> > On 31/03/2015 1:05 PM, Kevin E. Thorpe wrote:
> >> Hello.
> >>
> >> I am trying to simulate recruitment in a randomized trial. Suppose I
> >> have three streams (strata) of patients represented by these data frames.
> >>
> >> df1 <- data.frame(strat=rep(1,10),id=1:10,pid=1001:1010)
> >> df2 <- data.frame(strat=rep(2,10),id=1:10,pid=2001:2010)
> >> df3 <- data.frame(strat=rep(3,10),id=1:10,pid=3001:3010)
> >>
> >> What I need to do is construct a data frame with all of these combined
> >> where the order of selection from one of the three data frames is
> >> randomized but once a stratum is selected patients are selected
> >> sequentially from that data frame.
> >>
> >> To see what I'm looking to achieve, suppose the first five subjects were
> >> to come, in order, from strata (data frames) 1, 2, 1, 3 and 2. The
> >> expected result should look like this:
> >>
> >> rbind(df1[1,],df2[1,],df1[2,],df3[1,],df2[2,])
> >>    strat id  pid
> >> 1      1  1 1001
> >> 2      2  1 2001
> >> 21     1  2 1002
> >> 4      3  1 3001
> >> 22     2  2 2002
> >>
> >> I hope what I'm trying to accomplish makes sense. Maybe I'm missing
> >> something obvious, but I really have no idea at the moment how to
> >> achieve this elegantly. Since I need to simulate many trial recruitments
> >> it needs to be general and compact.
> >>
> >> I appreciate any advice.
> >
> > How about something like this:
> >
> > # Permute an ordered vector of selections:
> > sel <- sample(c(rep(1, nrow(df1)), rep(2, nrow(df2)), rep(3, nrow(df3))))
> >
> > # Create an empty dataframe to hold the results
> > df <- data.frame(strat=NA, id=NA, pid=NA)[rep(1, length(sel)),]
> >
> > # Put the original dataframes into the appropriate slots:
> > df[sel == 1,] <- df1
> > df[sel == 2,] <- df2
> > df[sel == 3,] <- df3
> >
> > # Clean up the rownames
> > rownames(df) <- NULL
> >
> > Duncan Murdoch
> >
>
> Thanks Duncan.
>
> Once you see the solution it is indeed obvious.
>
> Kevin
>
> --
> Kevin E. Thorpe
> Head of Biostatistics, Applied Health Research Centre (AHRC)
> Li Ka Shing Knowledge Institute of St. Michael's
> Assistant Professor, Dalla Lana School of Public Health
> University of Toronto
> email: kevin.tho...@utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016
>

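Since Kevin asked for something general and compact, Duncan's permute-and-fill idea quoted above could be wrapped in a function that takes any number of strata. This is only a rough sketch: the name interleave_strata is made up for illustration, and it assumes every stratum data frame has the same columns.

```r
# Hypothetical helper (not from the thread): applies the permute-and-fill
# idea to a list of stratum data frames with identical columns.
interleave_strata <- function(strata) {
  # One label per patient (1,1,...,2,2,...), then shuffle the labels.
  sizes <- vapply(strata, nrow, integer(1))
  sel <- sample(rep(seq_along(strata), times = sizes))

  # Stack the strata; rows of each stratum stay contiguous and in order.
  stacked <- do.call(rbind, strata)

  # order(sel) gives the output slots for stratum 1, then stratum 2, ...,
  # each in increasing position, which matches the row order of `stacked`.
  out <- stacked
  out[order(sel), ] <- stacked
  rownames(out) <- NULL
  out
}

# Usage with the data frames from the original question
df1 <- data.frame(strat = rep(1, 10), id = 1:10, pid = 1001:1010)
df2 <- data.frame(strat = rep(2, 10), id = 1:10, pid = 2001:2010)
df3 <- data.frame(strat = rep(3, 10), id = 1:10, pid = 3001:3010)
res <- interleave_strata(list(df1, df2, df3))
head(res)
```

Calling the function repeatedly, for example inside replicate(), would give one simulated recruitment sequence per call.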
Another option would be to stack your strata and then sample from the combined data frame, something like this:

sample_size <- 10
population <- rbind(df1,df2,df3)
sim.sample <- population[sample(nrow(population),sample_size, replace=FALSE),]

Hope this is helpful,

Dan

Daniel J. Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services
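One caveat with sampling rows directly: the rows come back in random order within each stratum, so patient 7 of a stratum can appear before patient 2. If within-stratum order needs to be kept, as in the original question, one possible sketch is to use the random draw only to decide the sequence of strata and then hand out each stratum's patients in order. This assumes id runs 1, 2, 3, ... within each stratum, as in the example data; strata_drawn and seq_in_stratum are names made up for illustration.

```r
# Example data from the original question
df1 <- data.frame(strat = rep(1, 10), id = 1:10, pid = 1001:1010)
df2 <- data.frame(strat = rep(2, 10), id = 1:10, pid = 2001:2010)
df3 <- data.frame(strat = rep(3, 10), id = 1:10, pid = 3001:3010)

sample_size <- 10
population <- rbind(df1, df2, df3)

# Draw rows at random, but keep only the stratum each arrival comes from.
strata_drawn <- population$strat[sample(nrow(population), sample_size)]

# Running count of arrivals within each stratum: 1, 2, 3, ...
seq_in_stratum <- ave(seq_along(strata_drawn), strata_drawn, FUN = seq_along)

# The k-th arrival from stratum s receives that stratum's k-th patient
# (assumes id is 1..n within each stratum).
rows <- match(paste(strata_drawn, seq_in_stratum),
              paste(population$strat, population$id))
sim.sample <- population[rows, ]
rownames(sim.sample) <- NULL
sim.sample
```

Because the strata are drawn without replacement from the stacked rows, no stratum can be asked for more patients than it has.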