On Sep 22, 2009, at 5:58 PM, Corey Sparks wrote:
Dear R users,
I am interested in extracting columns from multiple dataframes. The
problem is that the different dataframes have different combinations
of the same variable names. Here's a simple example:
a<-rep(1:10)
b<-rep(1:10)
c<-rep(21:30)
d<-rep(31:40)
dat.a<-data.frame(a,b,c,d)
names(dat.a)<-c("a", "b", "c", "d")
dat.b<-data.frame(a,c,d)
names(dat.b)<-c("a", "c", "d")
I would like to first see whether the names in the larger dataframe match
those of the smaller (they have the same variables):
names(dat.a)%in%names(dat.b)
Could anyone help with this problem? I would basically like to form
a subset of dat.a that matches the variable names in dat.b. If
there were only a few variables, this would be easy, but I have
between 4 and 5 thousand variables in each dataset.
I have never tried this on the scale you propose, but on your toy
example, here's what works:
> names(dat.a)%in%names(dat.b) # your code, which returns a logical vector
[1] TRUE FALSE TRUE TRUE
> subset(dat.a, select= names(dat.a)%in%names(dat.b) )
    a  c  d
1   1 21 31
2   2 22 32
3   3 23 33
4   4 24 34
5   5 25 35
6   6 26 36
7   7 27 37
8   8 28 38
9   9 29 39
10 10 30 40
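
An alternative you might try, sketched here on the same toy data and not tested at the 4-5 thousand column scale: intersect() on the two name vectors gives the shared names directly, and indexing the data frame by that character vector keeps only those columns. The object names common, dat.a.sub and only.in.a are just illustrative.

```r
# Toy data from the original post
a <- 1:10
b <- 1:10
c <- 21:30
d <- 31:40
dat.a <- data.frame(a, b, c, d)
dat.b <- data.frame(a, c, d)

# Names present in both data frames (in the order they occur in dat.a)
common <- intersect(names(dat.a), names(dat.b))

# Single-bracket character indexing returns a data frame,
# even if only one name happens to match
dat.a.sub <- dat.a[common]

# Names in dat.a that have no counterpart in dat.b
only.in.a <- setdiff(names(dat.a), names(dat.b))
```

Checking only.in.a (and the reverse, setdiff(names(dat.b), names(dat.a))) is a quick way to see which variables fail to match before subsetting.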
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help