Here's one way with aggregate():

library(car)  # needed for recode(); you will probably need to install it first
aggregate(DF[, 3:4], by = list(years = DF$years), FUN = mean, na.rm = TRUE)

# General form of a recode() call, where x is any vector:
#   recode(x, "c(1,2)='A'; else='B'")
DF$years <- recode(DF$years, "c(5,6,7)='5-7'")
DF

Then rerun the aggregate() call to get the means for the grouped years.

You may also want to have a look at the reshape and plyr packages.

--- On Sun, 4/25/10, steven mosher <mosherste...@gmail.com> wrote:

> From: steven mosher <mosherste...@gmail.com>
> Subject: [R] Noobie question on aggregate tapply and by
> To: "r-help" <r-help@r-project.org>
> Received: Sunday, April 25, 2010, 2:29 AM
>
> I have a 43MB dataframe (5 variables) and I'm trying to summarize
> subsets of the data.
> I've RTFM (not very clear) and looked at a variety of samples but
> can't seem to figure out how to make these functions work.
>
> A sample of what I want to do would be this:
>
> ids<-seq(1,50)
> years<-c(rep(5,10),rep(6,10),rep(7,10),rep(8,20))
> data<-c(rep(23.2,7),rep(14.2,17),rep(29.2,6),rep(13.4,10),rep(16.3,5),NA,rep(40,4))
> data2<-c(rep(22.2,5),rep(13.2,8),NA,rep(29.8,16),rep(12.4,10),rep(16.3,5),rep(38,5))
> DF<-data.frame(ids,years,data,data2)
>
> That will give you a dataframe that is a good analog of what I have.
> I would like to calculate means (with NA removed, na.rm) for each
> level of years.
>
>       data   data2
> 5     xx.    yy.
> 6     xx     yz
> 7     ...    ,,,
> 8     ..     ...
>
> And then things like this:
>
> 5-7 :  xx   yy
> 8   :  xy   zz
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
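
For reference, here is a minimal end-to-end sketch of the aggregate() approach from the top of this reply, run on the sample data in the quoted question. It rests on a few assumptions. The columns to average are taken to be columns 3 and 4 (data and data2). The grouping of years 5, 6 and 7 is done with base R's ifelse() as a stand-in for car's recode(), so the sketch runs without installing anything. The names by_year, by_group and year_group are made up for illustration.

```r
# Sample data copied from the quoted question.
ids   <- seq(1, 50)
years <- c(rep(5, 10), rep(6, 10), rep(7, 10), rep(8, 20))
data  <- c(rep(23.2, 7), rep(14.2, 17), rep(29.2, 6), rep(13.4, 10),
           rep(16.3, 5), NA, rep(40, 4))
data2 <- c(rep(22.2, 5), rep(13.2, 8), NA, rep(29.8, 16), rep(12.4, 10),
           rep(16.3, 5), rep(38, 5))
DF <- data.frame(ids, years, data, data2)

# Means of columns 3 and 4 for each level of years.
# na.rm = TRUE is passed through aggregate() to mean().
by_year <- aggregate(DF[, 3:4], by = list(years = DF$years),
                     FUN = mean, na.rm = TRUE)
print(by_year)

# Base-R stand-in for car::recode(): collapse years 5, 6 and 7 into "5-7".
# year_group is a hypothetical column name, kept separate so that
# DF$years itself is left untouched.
DF$year_group <- ifelse(DF$years %in% 5:7, "5-7", as.character(DF$years))

# Same aggregation, now over the collapsed groups.
by_group <- aggregate(DF[, c("data", "data2")],
                      by = list(year_group = DF$year_group),
                      FUN = mean, na.rm = TRUE)
print(by_group)
```

Giving the by list an explicit name (years = DF$years) labels the grouping column in the result, and pulling the grouping variable from DF rather than from the workspace means a recoded column is picked up when the call is rerun.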