On Sun, Apr 3, 2011 at 11:58 AM, Mark Novak <mnov...@ucsc.edu> wrote:
> # Hi there,
> # I am trying to apply a function over a moving-window for a large number of
> multivariate time-series that are grouped in a nested set of factors.  I
> have spent a few days searching for solutions with no luck, so any
> suggestions are much appreciated.
>
> # The data I have are for the abundance dynamics of multiple species
> observed in multiple fixed plots at multiple sites.  (In total I have 7
> sites, ~3-5 plots/site, ~150 species/plot, for 60 time-steps each.)  So my
> data look something like this:
>
> dat<-data.frame(Site=rep(1), Plot=rep(c(rep(1,8),rep(2,8),rep(3,8)),1),
> Time=rep(c(1,1,2,2,3,3,4,4)), Sp=rep(1:2), Count=sample(24))
> dat
>
> # Let the function I want to apply over a right-aligned window of w=2 time
> steps be:
> cv<-function(x){sd(x)/mean(x)}
> w<-2
>
> # The final output I want would look something like this:
> Out<-data.frame(dat,CV=round(c(NA,NA,runif(6,0,1),NA,NA,runif(6,0,1),NA,NA,runif(6,0,1)),2))
>
> # I could reshape and apply zoo:rollapply() to a given plot at a given site,
> and reshape again as follows:
> library(zoo)
> a<-subset(dat,Site==1&Plot==1)
> b<-reshape(a[-c(1,2)],v.names='Count',idvar='Time',timevar='Sp',direction='wide')
> d<-zoo(b[,-1],b[,1])
> d
> out<-rollapply(d, w, cv, na.pad=T, align='right')
> out
>
> # I would thereby have to loop through all my sites and plots which,
> although it deals with all species at once, still seems exceedingly
> inefficient.
>
> # So the question is, how do I use something like aggregate.zoo or tapply or
> even lapply to apply rollapply on each species' time series.
>
> # The closest I've come is the following two approaches:
>
> # First let:
> datx<-list(Site=dat$Site,Plot=dat$Plot,Sp=dat$Sp)
> daty<-dat$Count
>
> # Method 1.
> out1<-tapply(seq(along=daty),datx,function(i,x=daty){ rollapply(zoo(x[i]),
> w, cv, na.pad=T, align='right') })
> out1
> out1[,,1]
>
> # Which "works" in that it gives me the right answers, but in a format from
> which I can't figure out how to get back into the format I want.
>
> # Method 2.
> fun<-function(x){y<-zoo(x);coredata(rollapply(y, w,
> cv,na.pad=T,align='right'))}
> out2<-aggregate(daty,by=datx,fun)
> out2
>
> # Which superficially "works" better, but again only in a format I can't
> figure out how to use because the output seems to be a mix of data.frame and
> lists.
> out2[1,4]
> out2[1,5]
> is.data.frame(out2)
> is.list(out2)
>
> # The situation is made more problematic by the fact that the time point of
> first survey can differ between plots (e.g., site1-plot3 may only start at
> time-point 3).  As in...
> dat2<-dat
> dat2<-dat2[-which(dat2$Plot==3 & dat2$Time<3),]
> dat2
>
> # I must therefore ensure that I'm keeping track of the true time associated
> with each value, not just the order of their occurrences.  This information
> is (seemingly) lost by both methods.
> datx<-list(Site=dat2$Site,Plot=dat2$Plot,Sp=dat2$Sp)
> daty<-dat2$Count
>
> # Method 1.
> out3<-tapply(seq(along=daty),datx,function(i,x=daty){ rollapply(zoo(x[i]),
> w, cv, na.pad=T, align='right') })
> out3
> out3[1,3,1]
> time(out3[1,3,1])
>
> # Method 2
> out4<-aggregate(daty,by=datx,fun)
> out4
> time(out4[3,4])
>
>
> # Am I going about this all wrong?  Is there a different package to try?
> Any thoughts and suggestions are much appreciated!
>
> # R 2.12.2 GUI 1.36 Leopard build 32-bit (5691); zoo 1.6-4
>
> # Thanks!
> # -mark
>

Try ave:

dat$cv <- ave(dat$Count, dat[c("Site", "Plot", "Sp")],
  FUN = function(x) rollapply(zoo(x), 2, cv, na.pad = TRUE, align = "right"))

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
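
For readers following along, here is a minimal sketch of the ave() suggestion applied to the reduced data set (dat2) from the question, where plot 3 only starts at time step 3. It rests on a few assumptions that are not in the original thread: set.seed() is added only for reproducibility, the rows are explicitly sorted by time within each Site/Plot/Sp group, and fill = NA is used in place of na.pad = TRUE, which newer zoo releases deprecate. Because ave() returns a vector aligned with the input rows, each CV value stays on the row that carries its Time. Note that the window runs over consecutive observations, so a gap inside a series would be treated as adjacent time steps.

```r
library(zoo)

# Example data shaped like the question's: 1 site, 3 plots, 2 species, 4 time steps
set.seed(1)  # assumption: seed added only so the sketch is reproducible
dat <- data.frame(Site  = 1,
                  Plot  = rep(1:3, each = 8),
                  Time  = rep(rep(1:4, each = 2), times = 3),
                  Sp    = rep(1:2, times = 12),
                  Count = sample(24))

# Plot 3 is first surveyed at time step 3, as in dat2 from the question
dat2 <- dat[!(dat$Plot == 3 & dat$Time < 3), ]

# Assumption: sort by time within each group so the window follows time order
dat2 <- dat2[order(dat2$Site, dat2$Plot, dat2$Sp, dat2$Time), ]

cv <- function(x) sd(x) / mean(x)
w <- 2

# Rolling CV per Site/Plot/Sp group; the result lines up with the rows of dat2
dat2$CV <- ave(dat2$Count, dat2[c("Site", "Plot", "Sp")],
               FUN = function(x) {
                 coredata(rollapply(zoo(x), w, cv, fill = NA, align = "right"))
               })

dat2
```

The first observed time step of every series comes out as NA, including time step 3 in plot 3, and the Time column remains available next to each CV value.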