Re: [R] Filtering a dataset's columns by another dataset's column names

David Winsemius Fri, 27 Feb 2009 09:42:50 -0800

So you want the data that is in Dataset 1, but only the columns whose names also appear in Dataset 2:


How about:


 subset(DS1, select = names(DS1) %in% names(DS2) )

> DS1 <- read.table(textConnection("Individual SNP1 SNP2 SNP3 SNP4 SNP5
+ 1    A    G    T    C    A
+ 2    T    C    A    G    T
+ 3    A    C    T    C    A"), header=TRUE)

> DS2 <- read.table(textConnection("Individual SNP1 SNP3 SNP5 SNP6 SNP7
+ 4    A    T    T    G    C
+ 5    T    A    A    G    G
+ 6    A    A    T    C    G"), header=TRUE)

> subset(DS1, select= names(DS1) %in% names(DS2) )
  Individual SNP1 SNP3 SNP5
1          1    A    T    A
2          2    T    A    T
3          3    A    T    A

Tested!
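
For comparison, here is a minimal sketch of the same selection done with intersect() and bracket indexing rather than subset(). It is not part of the tested session above. The names "common" and "DS3" are just illustrative, and the text= argument to read.table() assumes a reasonably recent version of R.

```r
# Rebuild the toy data from this thread.
# text= is used here instead of an explicit textConnection().
DS1 <- read.table(text = "Individual SNP1 SNP2 SNP3 SNP4 SNP5
1 A G T C A
2 T C A G T
3 A C T C A", header = TRUE, stringsAsFactors = FALSE)

DS2 <- read.table(text = "Individual SNP1 SNP3 SNP5 SNP6 SNP7
4 A T T G C
5 T A A G G
6 A A T C G", header = TRUE, stringsAsFactors = FALSE)

# Column names present in both data frames, kept in DS1's column order.
common <- intersect(names(DS1), names(DS2))

# drop = FALSE keeps the result a data frame even if only one column is shared.
DS3 <- DS1[, common, drop = FALSE]
DS3
```

On the toy data this should select the same columns as the subset() call (Individual, SNP1, SNP3, SNP5). Having the shared names in a character vector can be handy if the same selection later needs to be applied to DS2 as well.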
--
David Winsemius
Heritage Labs

On Feb 27, 2009, at 12:27 PM, Josh B wrote:

Hello all,

I hope some of you can come to my rescue, yet again.

I have two genetic datasets, and I want one of the datasets to have only the columns that are in common with the other dataset.

Here is a toy example (my real datasets have hundreds of columns):

Dataset 1:

Individual    SNP1    SNP2    SNP3    SNP4    SNP5
1    A    G    T    C    A
2    T    C    A    G    T
3    A    C    T    C    A

Dataset 2:

Individual    SNP1    SNP3    SNP5    SNP6    SNP7
4    A    T    T    G    C
5    T    A    A    G    G
6    A    A    T    C    G

I want Dataset 1 to have only the columns that are also represented in Dataset 2, i.e., I want to generate a new Dataset 3 that looks like this:


Individual    SNP1    SNP3    SNP5
1    A    T    A
2    T    A    T
3    A    T    A

Does anyone know how I could do this? Keep in mind that this is not a simple merge, as in the "merge" function.


Thanks very much for your help everyone.
Josh B.





______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

