On Sep 22, 2009, at 5:58 PM, Corey Sparks wrote:

Dear R users,
I would like to extract matching columns from multiple dataframes. The problem is that the dataframes contain different combinations of the same variable names. Here is a simple example:
a<-rep(1:10)
b<-rep(1:10)
c<-rep(21:30)
d<-rep(31:40)

dat.a<-data.frame(a,b,c,d)
names(dat.a)<-c("a", "b", "c", "d")

dat.b<-data.frame(a,c,d)
names(dat.b)<-c("a", "c", "d")

I would first like to see which names in the larger dataframe match those in the smaller one (that is, which variables they share):

names(dat.a)%in%names(dat.b)


Could anyone help with this problem? I would basically like to form a subset of dat.a that matches the variable names in dat.b. If there were only a few variables, this would be easier, but I have between 4 and 5 thousand variables in each dataset.

I have never tried this on the scale you propose, but on your toy example, here's what works:

> names(dat.a)%in%names(dat.b) # your code which returns a logical vector
[1]  TRUE FALSE  TRUE  TRUE

> subset(dat.a, select= names(dat.a)%in%names(dat.b) )
    a  c  d
1   1 21 31
2   2 22 32
3   3 23 33
4   4 24 34
5   5 25 35
6   6 26 36
7   7 27 37
8   8 28 38
9   9 29 39
10 10 30 40
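
For comparison, the same subset can probably also be obtained with bracket indexing on the shared names. This is only a sketch on the toy data, not something tested on 4 to 5 thousand columns; the names `common` and `sub.a` are made up here for illustration and do not appear in the original post.

```r
# Rebuild the toy data from the question
a <- 1:10; b <- 1:10; c <- 21:30; d <- 31:40
dat.a <- data.frame(a, b, c, d)
dat.b <- data.frame(a, c, d)

# Are all of dat.b's variables present in dat.a?
all(names(dat.b) %in% names(dat.a))

# Names shared by both data frames ('common' is an illustrative name)
common <- intersect(names(dat.a), names(dat.b))

# Index columns by name; drop = FALSE keeps a data frame
# even if only one column matches
sub.a <- dat.a[, common, drop = FALSE]

# Names in dat.a that dat.b does not have
setdiff(names(dat.a), names(dat.b))
```

Indexing by name rather than by a logical vector makes it easy to reuse `common` on both data frames, so dat.b[, common] would give the columns in the same order.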



--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
