on 02/27/2009 11:27 AM Josh B wrote:
> Hello all,
>
> I hope some of you can come to my rescue, yet again.
>
> I have two genetic datasets, and I want one of the datasets to have only the
> columns that are in common with the other dataset.
> Here is a toy example (my real datasets have hundreds of columns):
>
> Dataset 1:
>
> Individual SNP1 SNP2 SNP3 SNP4 SNP5
> 1          A    G    T    C    A
> 2          T    C    A    G    T
> 3          A    C    T    C    A
>
> Dataset 2:
>
> Individual SNP1 SNP3 SNP5 SNP6 SNP7
> 4          A    T    T    G    C
> 5          T    A    A    G    G
> 6          A    A    T    C    G
>
> I want Dataset1 to have only columns that are also represented in Dataset 2,
> i.e., I want to generate a new Dataset 3 that looks like this:
>
> Individual SNP1 SNP3 SNP5
> 1          A    T    A
> 2          T    A    T
> 3          A    T    A
>
> Does anyone know how I could do this? Keep in mind that this is not a simple
> merge, as in the "merge" function.
>
> Thanks very much for your help everyone.
> Josh B.

> Same.Cols <- intersect(names(DF1), names(DF2))

> Same.Cols
[1] "Individual" "SNP1"       "SNP3"       "SNP5"

> rbind(DF1[, Same.Cols], DF2[, Same.Cols])
  Individual SNP1 SNP3 SNP5
1          1    A    T    A
2          2    T    A    T
3          3    A    T    A
4          4    A    T    T
5          5    T    A    A
6          6    A    A    T

See ?intersect, which gives you the common column names, which you can
then use in rbind().

HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
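
Note that rbind() stacks the rows of both datasets. If only "Dataset 3" from the question is wanted (Dataset 1 restricted to the columns it shares with Dataset 2), indexing DF1 by the intersected names should be enough. Below is a minimal, self-contained sketch. It assumes the toy data are held in data frames named DF1 and DF2, as in the reply; how they are built here is an assumption for illustration, since the thread does not show how the data were read in, and DF3 is a name introduced only for this example.

```r
# Toy data from the question, rebuilt by hand (assumed construction;
# the real data would presumably come from read.table() or similar)
DF1 <- data.frame(Individual = 1:3,
                  SNP1 = c("A", "T", "A"),
                  SNP2 = c("G", "C", "C"),
                  SNP3 = c("T", "A", "T"),
                  SNP4 = c("C", "G", "C"),
                  SNP5 = c("A", "T", "A"),
                  stringsAsFactors = FALSE)

DF2 <- data.frame(Individual = 4:6,
                  SNP1 = c("A", "T", "A"),
                  SNP3 = c("T", "A", "A"),
                  SNP5 = c("T", "A", "T"),
                  SNP6 = c("G", "G", "C"),
                  SNP7 = c("C", "G", "G"),
                  stringsAsFactors = FALSE)

# Column names present in both data frames, in DF1's column order
Same.Cols <- intersect(names(DF1), names(DF2))

# "Dataset 3": DF1 restricted to the shared columns.
# drop = FALSE keeps the result a data frame even if only one column is shared.
DF3 <- DF1[, Same.Cols, drop = FALSE]
DF3
```

The same indexing applied to DF2 (DF2[, Same.Cols, drop = FALSE]) gives the matching subset of the second dataset, which is what the rbind() call above combines.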