Don't use subset as a function name: it's already the name of a rather important function, as is data (but at least you aren't using that one as a function, so it's not quite so bad). Finally, use dput() when sending data so we get a plain-text, reproducible version.
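To make the naming point concrete, here is a rough sketch of your function under a different name, with the pieces stacked into one data frame via do.call(rbind, ...). The names grouped and expand_arm are made up for illustration, I'm assuming the data look like what you posted, and I've put the 1s before the 0s to match your desired output:

```r
# Same numbers as in the post; 'grouped' and 'expand_arm' are hypothetical names
grouped <- data.frame(
  Study = c(1L, 1L, 2L, 2L, 3L, 3L),
  TX    = c(1L, 0L, 1L, 0L, 1L, 0L),
  AEs   = c(3L, 2L, 1L, 2L, 1L, 1L),
  N     = c(5L, 7L, 10L, 7L, 8L, 4L)
)

# One row per subject for arm i: AEs ones followed by (N - AEs) zeros
expand_arm <- function(i) {
  n <- grouped$N[i]
  k <- grouped$AEs[i]
  data.frame(
    Study = rep(grouped$Study[i], n),
    TX    = rep(grouped$TX[i], n),
    AEs   = rep(c(1L, 0L), c(k, n - k))
  )
}

# lapply() returns a list of data frames; do.call(rbind, ...) stacks them into one
long <- do.call(rbind, lapply(seq_len(nrow(grouped)), expand_arm))

# Sanity checks: row count should equal sum(grouped$N),
# and summing AEs per arm should give back the original counts
nrow(long) == sum(grouped$N)
aggregate(AEs ~ Study + TX, data = long, FUN = sum)
```

Returning a data frame rather than a matrix from each call keeps the column names, so the stacked result is ready for modelling without further renaming.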
I'd try something like this:

dats <- structure(list(Study = c(1L, 1L, 2L, 2L, 3L, 3L),
    TX = c(1L, 0L, 1L, 0L, 1L, 0L),
    AEs = c(3L, 2L, 1L, 2L, 1L, 1L),
    N = c(5L, 7L, 10L, 7L, 8L, 4L)),
    .Names = c("Study", "TX", "AEs", "N"),
    class = "data.frame",
    row.names = c("1", "2", "3", "4", "5", "6"))
# See how handy dput can be :-)

# Repeat row i N[i] times, then drop the N column (column 4)
dats[unlist(mapply(FUN = function(x, y) rep(x, y), 1:NROW(dats), dats$N)), -4]

which isn't super elegant, but others might have something better. Note that this only repeats each row N times: the AEs column still holds the per-arm count at that point, so it would need recoding to 0/1 afterwards.

Best,
Michael

On Tue, May 15, 2012 at 1:24 AM, Cheenghee AM Koh <sigo...@gmail.com> wrote:
> Hello, R-fellows,
>
> I have a question that I really don't know how to solve. I have spent hours
> online searching for possible solutions, but in vain. If anyone could
> help me handle this issue, it would be much appreciated!
>
> I have a "grouped" dataset like this:
>
>> data
>   Study TX AEs  N
> 1     1  1   3  5
> 2     1  0   2  7
> 3     2  1   1 10
> 4     2  0   2  7
> 5     3  1   1  8
> 6     3  0   1  4
>
> where Study is the study id, TX is treatment, AEs is how many people in
> this trial are positive, and N is the number of subjects. Therefore,
> row 1 stands for: it is the treatment arm of study 1, where
> there are 5 subjects and 3 of them are positive. Row 2 stands for: it
> is the control arm of study 1, where there are 7 subjects and 2 of them
> are positive.
>
> Now I would like to "un-group" them, to make it like:
>
> Study TX AEs
>     1  1   1
>     1  1   1
>     1  1   1
>     1  1   0
>     1  1   0
>     1  0   1
>     1  0   1
>     1  0   0
>     1  0   0
>     1  0   0
>     1  0   0
>     1  0   0
>     2  1   1
> .....................
> .....................
>
> But I wasn't able to do it. In fact I wrote a small function, and used
> "lapply" to get what I want. It worked well, and did give me what I want.
> But I wasn't able to collapse all the returns into one single data frame
> for subsequent analysis.
>
> The function I wrote:
>
> subset = function(i){
>   d = c(rep(data[i,1], data[i,4]), rep(data[i,2], data[i,4]), rep(0:1,
>         c(data[i,4] - data[i,3], data[i,3])))
>   d = matrix(d, data[i,4], 3)
>   d
> }
>
> then:
>
> Data = lapply(1:6, subset)
> Data
>
> Therefore, I tried to write a loop. But no matter how I tried, I can't get
> what I want.
>
> Any idea?
>
> Thank you so much!
>
> Best,
>
> --
> Cheenghee Masaki Koh, MSW, MS(c), PhD Student
> School of Social Service Administration
> Department of Health Studies, Division of Biological Science
> University of Chicago
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.