On Sat, Feb 05, 2011 at 11:01:33AM +0100, Sascha Vieweg wrote:
> I have got data with one column indicating the area where the data
> was recorded:
>
> R: n <- 43
> R: df <- data.frame("area"=sample(1:7, n, repl=T), "dat"=rnorm(n))
>
> In each of the 7 different areas I want to implement one of 7
> specific strategies. The assignment should be random. Therefore, I
> pair 7 areas with 7 strategies randomly by
>
> R: ass <- as.data.frame(cbind("area"=sample(1:7, 7),
>      "strategy"=sample(1:7, 7)))
>
> Now I want to create a new variable indicating, which case in the
> original data should be assigned to which strategy. I thought
> about
>
> R: x <- numeric(n)
> R: for(i in 1:7){
>      x[df[, "area"]==i] <- ass[ ass[, "area"]==i , "strategy"]
>    }
>
> and then binding the new variable to the data frame
>
> R: str(df2 <- as.data.frame(cbind(df, "strategy"=x)))
>
> which works fine. My question is whether there is a more elegant
> way?

Hello.

If the table "ass" is sorted according to "area", then its second
column may be used as a function mapping "area" to "strategy".
This leads to the following

  ass2 <- ass[order(ass[, "area"]), "strategy"]
  y <- ass2[df[, "area"]]
  identical(x, y + 0)

  [1] TRUE

This also suggests that the same distribution of random assignments
is obtained if "area" is created already sorted and only the second
column of "ass" is random

  ass <- as.data.frame(cbind("area"=1:7, "strategy"=sample(1:7, 7)))

Whether creating only this table is sufficient depends on the
application.

Hope this helps.

Petr Savicky.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
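
For readers who want to try the approaches side by side, here is a minimal sketch, not part of the original thread. It rebuilds the data from the question, repeats the loop and the sorted lookup vector from the reply, and adds a variant based on match(), which should give the same result without sorting "ass" first. The set.seed() call and the names z and df$strategy are assumptions added only for illustration.

```r
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible

n   <- 43
df  <- data.frame(area = sample(1:7, n, replace = TRUE), dat = rnorm(n))
ass <- data.frame(area = sample(1:7, 7), strategy = sample(1:7, 7))

# Loop from the original question, kept as the reference result
x <- numeric(n)
for (i in 1:7) {
  x[df$area == i] <- ass[ass$area == i, "strategy"]
}

# Lookup vector from the reply: after sorting by area,
# position i of ass2 holds the strategy assigned to area i
ass2 <- ass[order(ass$area), "strategy"]
y <- ass2[df$area]

# Variant (illustrative): match() finds the row of each area in ass,
# so the table does not have to be sorted
z <- ass$strategy[match(df$area, ass$area)]

# All three approaches should agree
stopifnot(all(x == y), all(y == z))

# Attach the result to the data frame
df$strategy <- z
```

The lookup-vector form relies on the areas being exactly 1 to 7, so that an area code can be used directly as an index. The match() form drops that requirement and would also cope with area labels that are not consecutive integers.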