Re: [R] Creating binary variable depending on strings of two dataframes

David Winsemius Tue, 10 May 2011 06:22:22 -0700


On May 10, 2011, at 3:18 AM, noxyp...@gmail.com wrote:

On Fri, May 6, 2011 at 7:41 PM, David Winsemius <dwinsem...@comcast.net> wrote:


On May 6, 2011, at 11:35 AM, Pete Pete wrote:

Gabor Grothendieck wrote:
On Tue, Dec 7, 2010 at 11:30 AM, Pete Pete <noxyp...@gmail.com> wrote:
Hi,
consider the following two dataframes:
x1=c("232","3454","3455","342","13")
x2=c("1","1","1","0","0")
data1=data.frame(x1,x2)

y1=c("232","232","3454","3454","3455","342","13","13","13","13")
y2=c("E1","F3","F5","E1","E2","H4","F8","G3","E1","H2")
data2=data.frame(y1,y2)
I need a new column in dataframe data1 (x3), which is either 0 or 1 depending on whether the value "E1" occurs in y2 of data2 where x1 = y1. The result of data1 should look like this:
    x1 x2 x3
1  232  1  1
2 3454  1  1
3 3455  1  0
4  342  0  0
5   13  0  1
I think a SQL command could help me, but I am too inexperienced with it to get there.
Try this:
library(sqldf)
sqldf("select x1, x2, max(y2 = 'E1') x3 from data1 d1 left join data2 d2
on (x1 = y1) group by x1, x2 order by d1.rowid")
  x1 x2 x3
1  232  1  1
2 3454  1  1
3 3455  1  0
4  342  0  0
5   13  0  1

snipped Gabor's sig
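
For readers without sqldf, the same x3 column can also be built in base R. This is only a sketch of an alternative, not the method posted above; the name ids_with_E1 is made up for illustration.

```r
# Base-R sketch: flag rows of data1 whose x1 appears in data2 with y2 == "E1"
x1 <- c("232", "3454", "3455", "342", "13")
x2 <- c("1", "1", "1", "0", "0")
data1 <- data.frame(x1, x2)

y1 <- c("232", "232", "3454", "3454", "3455", "342", "13", "13", "13", "13")
y2 <- c("E1", "F3", "F5", "E1", "E2", "H4", "F8", "G3", "E1", "H2")
data2 <- data.frame(y1, y2)

# IDs that carry the code "E1" at least once (hypothetical helper name)
ids_with_E1 <- data2$y1[data2$y2 == "E1"]

# 1 if the row's x1 is among those IDs, otherwise 0
data1$x3 <- as.integer(data1$x1 %in% ids_with_E1)
data1
```

Unlike the SQL version, this needs no join or grouping, because %in% only asks whether a matching ID exists.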

That works pretty cool, but I need to automate this a bit more. Consider the following example:

list1=c("A01","B04","A64","G84","F19")

x1=c("232","3454","3455","342","13")
x2=c("1","1","1","0","0")
data1=data.frame(x1,x2)

y1=c("232","232","3454","3454","3455","342","13","13","13","13")
y2=c("E13","B04","F19","A64","E22","H44","F68","G84","F19","A01")
data2=data.frame(y1,y2)

I now want to create a loop which creates, for every value in list1, a new binary variable in data1. The result should look like:
x1      x2      A01     B04     A64     G84     F19
232     1       0       1       0       0       0
3454    1       0       0       1       0       1
3455    1       0       0       0       0       0
342     0       0       0       0       0       0
13      0       1       0       0       1       1


Loops!?! We don't need no steenking loops!

xtb <-  with(data2, table(y1,y2))
cbind(data1, xtb[match(data1$x1, rownames(xtb)), ] )

      x1 x2 A01 A64 B04 E13 E22 F19 F68 G84 H44
232   232  1   0   0   1   1   0   0   0   0   0
3454 3454  1   0   1   0   0   0   1   0   0   0
3455 3455  1   0   0   0   0   1   0   0   0   0
342   342  0   0   0   0   0   0   0   0   0   1
13     13  0   1   0   0   0   0   1   1   1   0

I am guessing that you were too ... er, busy? ... to complete the table?
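
If only the codes in list1 are wanted, in that order and as strict 0/1 flags rather than counts, the table can be subset by column name. The following is a sketch under two assumptions: every code in list1 occurs somewhere in data2$y2 (otherwise the column subscript fails), and unclass() is used so that the cross-tabulation is handled as a plain matrix. The names m and flags are made up for illustration.

```r
list1 <- c("A01", "B04", "A64", "G84", "F19")

x1 <- c("232", "3454", "3455", "342", "13")
x2 <- c("1", "1", "1", "0", "0")
data1 <- data.frame(x1, x2)

y1 <- c("232", "232", "3454", "3454", "3455", "342", "13", "13", "13", "13")
y2 <- c("E13", "B04", "F19", "A64", "E22", "H44", "F68", "G84", "F19", "A01")
data2 <- data.frame(y1, y2)

# Cross-tabulate IDs against codes, then drop the "table" class
m <- unclass(with(data2, table(y1, y2)))

# Rows in data1 order, columns restricted to list1;
# "> 0" turns counts into TRUE/FALSE and "* 1L" into 0/1
flags <- (m[match(data1$x1, rownames(m)), list1, drop = FALSE] > 0) * 1L

cbind(data1, flags)
```

The "> 0" step only matters when an ID carries the same code more than once, in which case table() would report a count above 1.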


--

David Winsemius, MD
West Hartford, CT


Thanks a lot! Pretty simple. I am just so used to sqldf right now.

So how would you handle more complicated strings like these:

y1=c("232","232", "232","3454","3454","3455","342","13","13","13","13")

y2=c("E13","B04 A01 F19","B04","F19","A64 G84 A05","E22","H44 C35","F68","G84","F19","A01")
data2=data.frame(y1,y2)

Where you want to extract for instance all "A01" from the strings?

I think you need either to explain what you want in more words of the English language or to offer an example of the desired output. I suspect you did not want something as simple as this:


> A01.instances <- grep("A01" , data2$y2)
> A01.instances
[1]  2 11
> data2[A01.instances, ]
    y1          y2
2  232 B04 A01 F19
11  13         A01

Or maybe you did?
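
If what is wanted is instead the same kind of 0/1 table as before, one possible approach is to split the space-separated strings into single codes first. This is a guess at the intended output, since none was given; the names codes, long and flags are made up for illustration, and lengths() needs R 3.2.0 or later.

```r
list1 <- c("A01", "B04", "A64", "G84", "F19")

x1 <- c("232", "3454", "3455", "342", "13")
x2 <- c("1", "1", "1", "0", "0")
data1 <- data.frame(x1, x2)

y1 <- c("232", "232", "232", "3454", "3454", "3455", "342", "13", "13", "13", "13")
y2 <- c("E13", "B04 A01 F19", "B04", "F19", "A64 G84 A05", "E22", "H44 C35",
        "F68", "G84", "F19", "A01")
data2 <- data.frame(y1, y2, stringsAsFactors = FALSE)

# Split each string on spaces into its individual codes
codes <- strsplit(data2$y2, " +")

# Long format: one row per (ID, single code) pair
long <- data.frame(y1 = rep(data2$y1, lengths(codes)),
                   y2 = unlist(codes),
                   stringsAsFactors = FALSE)

# For each code in list1: 1 if the ID has that code anywhere, else 0
flags <- sapply(list1, function(code) {
  as.integer(data1$x1 %in% long$y1[long$y2 == code])
})

cbind(data1, flags)
```

Splitting compares whole codes, so a search for "A01" cannot accidentally hit a longer code that merely contains those characters, which a bare grep("A01", ...) could.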

--
David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
