On Fri, Dec 31, 2010 at 01:51:18AM -0800, Sarah wrote:
>
> Dear all,
>
> I'm having trouble with my dataframe, and I hope someone can help me out...
> I have data from 40 subjects, displayed in a dataframe. I have randomly
> assigned subjects to group 1 or 0 (mar.y==0 or mar.y==1, with probabilities
> used).
> In the end, I want 34 cases assigned to group 0, with the rest of the
> subjects assigned to group 1. However, if there are more than 34 cases
> assigned to group 0 due to the randomness, I would like to keep 34 cases in
> group 0 (this is already written in my script below), but with the rest of
> the cases assigned to group 1. (Vice versa, if there are less than 34 cases
> assigned to group 0, I would like to sample cases from group 1 and put them
> in group 0, while retaining the rest of group 1 in my dataframe.)
> I can't figure out how to keep 34 cases in group 0, WHILE assigning the rest
> of the cases a value 1 (mar.y==1)...
>
> if (length(which(df$mar.y==0))>34) {
>   df <- df[sample(which(df$mar.y==0),34), ]
> } else {
>   df <- df[c(which(df$mar.y==0),
>     sample(which(df$mar.y==1),34-length(which(df$mar.y==0)))), ]
> }
I am not sure what the question is. According to my tests, this code works if you want to replace df with a data frame containing exactly 34 cases.

The command sample(which(...)) is slightly dangerous: if which() produces only one index, say i, then sample(which(...)) samples from 1:i rather than from the single value i. However, with the parameters 34 and 40, your code applies sample() only to vectors of length at least 35 (first branch) or at least 40 - 34 = 6 (second branch), so the problem cannot occur here.

If you want to keep all cases and only reassign the groups, you can either modify df$mar.y (and not the whole df) or introduce a new column of df holding the index of the new group.

Petr Savicky.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
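P.S. For illustration, the suggestion of keeping all 40 cases and only reassigning the groups could look roughly like the sketch below. The example data, the helper resample(), the variable target and the column name mar.y.new are assumptions made up for this sketch and do not come from the original script. The helper indexes with sample.int(), so it avoids the sample() pitfall for a vector of length one.

```r
# Hypothetical example data: 40 subjects with a randomly drawn group.
set.seed(1)
df <- data.frame(id = 1:40, mar.y = rbinom(40, 1, 0.2))

# Sample from the elements of x itself, even when length(x) == 1
# (plain sample(x, size) would then sample from 1:x).
resample <- function(x, size) x[sample.int(length(x), size)]

target <- 34                       # desired number of cases in group 0
zeros  <- which(df$mar.y == 0)     # row indices currently in group 0
ones   <- which(df$mar.y == 1)     # row indices currently in group 1

new.group <- df$mar.y              # work on a copy, keep the original column
if (length(zeros) > target) {
  # too many cases in group 0: move the surplus to group 1
  new.group[resample(zeros, length(zeros) - target)] <- 1
} else if (length(zeros) < target) {
  # too few cases in group 0: move some cases from group 1 to group 0
  new.group[resample(ones, target - length(zeros))] <- 0
}
df$mar.y.new <- new.group          # all 40 rows are retained

table(df$mar.y.new)
```

Whatever the initial random assignment was, the new column then has 34 cases in group 0 and 6 in group 1. To overwrite the original grouping instead of adding a column, assign new.group to df$mar.y.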