On Fri, May 6, 2011 at 7:41 PM, David Winsemius <dwinsem...@comcast.net> wrote:
>
> On May 6, 2011, at 11:35 AM, Pete Pete wrote:
>
>>
>> Gabor Grothendieck wrote:
>>>
>>> On Tue, Dec 7, 2010 at 11:30 AM, Pete Pete <noxyp...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>> consider the following two dataframes:
>>>> x1=c("232","3454","3455","342","13")
>>>> x2=c("1","1","1","0","0")
>>>> data1=data.frame(x1,x2)
>>>>
>>>> y1=c("232","232","3454","3454","3455","342","13","13","13","13")
>>>> y2=c("E1","F3","F5","E1","E2","H4","F8","G3","E1","H2")
>>>> data2=data.frame(y1,y2)
>>>>
>>>> I need a new column in dataframe data1 (x3), which is either 0 or 1
>>>> depending on whether the value "E1" occurs in y2 of data2 where x1=y1.
>>>> The result of data1 should look like this:
>>>>     x1 x2 x3
>>>> 1  232  1  1
>>>> 2 3454  1  1
>>>> 3 3455  1  0
>>>> 4  342  0  0
>>>> 5   13  0  1
>>>>
>>>> I think a SQL command could help me but I am too inexperienced with it
>>>> to get there.
>>>>
>>>
>>> Try this:
>>>
>>> > library(sqldf)
>>> > sqldf("select x1, x2, max(y2 = 'E1') x3 from data1 d1 left join data2 d2 on (x1 = y1) group by x1, x2 order by d1.rowid")
>>>
>>>     x1 x2 x3
>>> 1  232  1  1
>>> 2 3454  1  1
>>> 3 3455  1  0
>>> 4  342  0  0
>>> 5   13  0  1
>>>
>>>
> snipped Gabor's sig
>>
>> That works pretty cool but I need to automate this a bit more. Consider
>> the following example:
>>
>> list1=c("A01","B04","A64","G84","F19")
>>
>> x1=c("232","3454","3455","342","13")
>> x2=c("1","1","1","0","0")
>> data1=data.frame(x1,x2)
>>
>> y1=c("232","232","3454","3454","3455","342","13","13","13","13")
>> y2=c("E13","B04","F19","A64","E22","H44","F68","G84","F19","A01")
>> data2=data.frame(y1,y2)
>>
>> I now want to create a loop which creates, for every value in list1, a
>> new binary variable in data1. The result should look like:
>>   x1 x2 A01 B04 A64 G84 F19
>>  232  1   0   1   0   0   0
>> 3454  1   0   0   1   0   1
>> 3455  1   0   0   0   0   0
>>  342  0   0   0   0   0   0
>>   13  0   1   0   0   1   1
>
> Loops!?! We don't need no steenking loops!
>
> > xtb <- with(data2, table(y1,y2))
> > cbind(data1, xtb[match(data1$x1, rownames(xtb)), ] )
>        x1 x2 A01 A64 B04 E13 E22 F19 F68 G84 H44
> 232   232  1   0   0   1   1   0   0   0   0   0
> 3454 3454  1   0   1   0   0   0   1   0   0   0
> 3455 3455  1   0   0   0   0   1   0   0   0   0
> 342   342  0   0   0   0   0   0   0   0   0   1
> 13     13  0   1   0   0   0   0   1   1   1   0
>
> I am guessing that you were too ... er, busy? ... to complete the table?
>
> --
>
> David Winsemius, MD
> West Hartford, CT
>
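Note that the cross-tabulation above returns a column for every code found in data2, not only the five codes in list1, and in alphabetical order. A minimal sketch of one way to restrict it to list1, in list1's order, follows. It assumes the codes are plain strings; the names flags and result are illustrative and not from the thread.

```r
# Data from the thread, stored as character rather than factor
list1 <- c("A01", "B04", "A64", "G84", "F19")
data1 <- data.frame(
  x1 = c("232", "3454", "3455", "342", "13"),
  x2 = c("1", "1", "1", "0", "0"),
  stringsAsFactors = FALSE
)
data2 <- data.frame(
  y1 = c("232", "232", "3454", "3454", "3455", "342", "13", "13", "13", "13"),
  y2 = c("E13", "B04", "F19", "A64", "E22", "H44", "F68", "G84", "F19", "A01"),
  stringsAsFactors = FALSE
)

# Fixing the factor levels makes table() return rows in data1's order and
# columns in list1's order; codes outside list1 become NA and are dropped.
xtb <- table(
  factor(data2$y1, levels = data1$x1),
  factor(data2$y2, levels = list1)
)

# Turn counts into 0/1 flags; unclass() leaves a plain integer matrix
flags <- (unclass(xtb) > 0) * 1L

result <- cbind(data1, flags)
rownames(result) <- NULL
result
```

Because the levels are set explicitly, a code in list1 that never occurs in data2 still gets a column of zeros instead of causing a subscript error.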
Thanks a lot! Pretty simple. I am so used to sqldf by now.

So how would you handle more complicated strings like these:

y1=c("232","232", "232", "3454","3454","3455","342","13","13","13","13")
y2=c("E13","B04 A01 F19","B04","F19","A64 G84 A05","E22","H44 C35","F68","G84","F19","A01")
data2=data.frame(y1,y2)

where you want to extract, for instance, every "A01" from the strings?
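One possible approach to the multi-code strings, offered as a sketch rather than as an answer given in the thread: split each y2 entry on spaces so that every code gets its own row, then match on whole codes. It assumes a single space is the only separator; the names codes, long and has_A01 are illustrative.

```r
y1 <- c("232", "232", "232", "3454", "3454", "3455", "342", "13", "13", "13", "13")
y2 <- c("E13", "B04 A01 F19", "B04", "F19", "A64 G84 A05", "E22", "H44 C35",
        "F68", "G84", "F19", "A01")
data2 <- data.frame(y1, y2, stringsAsFactors = FALSE)

# Split each string into its individual codes (assumes single-space separators)
codes <- strsplit(data2$y2, " ", fixed = TRUE)

# One row per (id, code) pair: repeat each y1 once per code it carries
long <- data.frame(
  y1 = rep(data2$y1, lengths(codes)),
  y2 = unlist(codes),
  stringsAsFactors = FALSE
)

# Ids that carry the code "A01" anywhere in their strings
has_A01 <- unique(long$y1[long$y2 == "A01"])
has_A01
```

Since long has the same two-column shape as the original data2, it could be passed to table() in the same way as before to flag several codes at once.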