Re: [R] Combining Overlapping Data

Sarah Goslee Fri, 11 Nov 2011 15:13:34 -0800

What about merge() with all=FALSE?

> x <- data.frame(a=letters[1:6], b=1:6)
> y <- data.frame(a=letters[4:9], b=11:16)
> x
  a b
1 a 1
2 b 2
3 c 3
4 d 4
5 e 5
6 f 6
> y
  a  b
1 d 11
2 e 12
3 f 13
4 g 14
5 h 15
6 i 16
> merge(x, y, by="a", all=FALSE)
  a b.x b.y
1 d   4  11
2 e   5  12
3 f   6  13
>


If that doesn't work, some sample data would be useful.
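
One caveat: merge() pairs every matching row of x with every matching row of y, so a strain with 150 rows in x and 25 in y would come back as 150 * 25 rows rather than 175. If the aim is to stack the rows for shared strains, something like the following might be closer. This is only a minimal sketch: it assumes both data frames have identical column names including strain, and the toy values and the names shared and both are made up for illustration.

```r
# Toy data standing in for the real x and y (hypothetical values).
x <- data.frame(strain = c("Chab1405", "Chab1405", "Chab1999"),
                value  = c(1.1, 1.2, 9.9),
                stringsAsFactors = FALSE)
y <- data.frame(strain = c("Chab1405", "Chab2000"),
                value  = c(1.3, 5.5),
                stringsAsFactors = FALSE)

# Strains that occur in both data sets.
shared <- intersect(x$strain, y$strain)

# Keep only the rows for shared strains, then stack the two subsets.
both <- rbind(x[x$strain %in% shared, ], y[y$strain %in% shared, ])
both
```

Here %in% is applied to the strain column rather than to the whole data frame, and the resulting TRUE/FALSE vector is used as a row index.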

Sarah

On Fri, Nov 11, 2011 at 4:07 PM, kickout <kyle.ko...@gmail.com> wrote:
> I've scoured the archives but have found no concrete answer to my question.
>
> Problem: Two data sets
>
> 1st data set(x) = 20,000 rows
> 2nd data set(y) = 5,000 rows
>
> Both have the same column names; the column of interest to me is a variable
> called strain.
>
> For example, a strain named "Chab1405" appears in x 150 times and in y 25
> times.
> Strain "Chab1999" appears 200 times in x and not at all in y (so I don't want
> that retained).
>
>
> I want to create a new data frame that has all 175 measurements for
> "Chab1405" and any other strain that appears in both data sets,
> but not strains that appear in only one data set. So I want the
> intersection of the two data sets (maybe?).
>
> I've tried x %in% y, but that only gives TRUE/FALSE.
>

-- 
Sarah Goslee
http://www.functionaldiversity.org

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.