On Mon, Nov 8, 2010 at 3:16 PM, Richard Vlasimsky
<richard.vlasim...@imidex.com> wrote:
> Does anyone recommend a more efficient way to "roll" values in a time series 
> dataset?
>
> I merged a bunch of different time series datasets (tens of thousands of 
> them) whose observation dates and sampling intervals differ.  Some time series 
> observations are reported at the beginning of the month, some at the end, 
> some on Mondays, some on Wednesdays, some annually, etc.
>
> In the process of merging all of the irregular time series (by date 
> observed), a significant number of NA's appear in the dataset where I really 
> want the last reported value 'rolled'  forward.
>
> To use a concrete example, a time series that has reported values at the 
> beginning of every month shows NA's for every day except the date it was 
> reported (in this case, the first of the month).  I want the value to roll 
> forward so that NA's after the first of the month are replaced with a last 
> reported value.
>
> I wrote the following for loop to accomplish the task on the object 
> 'dataset', however it is far too slow to process tens of thousands of 
> different time series with 15,000 observations each.  At this rate, 
> it would take weeks to complete.
>
> for (j in seq_along(dataset))
> {
>        last <- NA
>        for (i in seq_len(nrow(dataset)))
>        {
>                # fill an NA with the last seen value; otherwise remember the value
>                if (is.na(dataset[i, j])) dataset[i, j] <- last
>                else last <- dataset[i, j]
>        }
> }
>
> One would think such a simple operation could run much faster.  
> My sense is that using the "apply" function is the way to go, however I just 
> can't get my head around a function that would reference the last reported value.
>
> Any guidance is appreciated.
>

Don't know if it's fast enough for you, but in zoo you can merge and
carry the last observation forward like this:

# suppose z1, z2, z3 are zoo series

na.locf(merge(z1, z2, z3)) # as many as you like

or

L <- list(z1, z2, z3)
na.locf(do.call("merge", L))

Either form produces a multivariate series, one input series per column,
with the NAs filled in.
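
For a concrete picture, here is a small sketch on made-up data. The series
z1 and z2 and their dates are hypothetical stand-ins for your monthly and
weekly series, not taken from your dataset. Note that na.locf() drops
leading NAs by default (na.rm = TRUE); na.rm = FALSE keeps every date.

```r
library(zoo)

# Hypothetical toy series observed on different dates
z1 <- zoo(c(10, 20), as.Date(c("2010-01-01", "2010-02-01")))                # monthly
z2 <- zoo(c(1, 2, 3), as.Date(c("2010-01-04", "2010-01-11", "2010-02-01"))) # irregular

m <- merge(z1, z2)                   # union of all dates; gaps show up as NA
filled <- na.locf(m, na.rm = FALSE)  # roll the last value forward, keep leading NAs

print(filled)
```

On the merged dates, z1 stays at 10 until the 2010-02-01 observation, and
z2 is still NA on 2010-01-01 because nothing has been reported yet.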

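If you would rather stay in base R, the inner loop can be replaced by an
index trick. This is only a sketch and has not been benchmarked on data of
your size: locf() is a hypothetical helper name, the small data frame stands
in for your 'dataset', and it assumes plain atomic columns (numeric,
character, logical), not factors.

```r
# Last observation carried forward for a single vector, no explicit loop.
locf <- function(x) {
  # position of the most recent non-NA value at each row (0 if none seen yet)
  idx <- cummax(seq_along(x) * !is.na(x))
  # prepend NA so that idx == 0 maps to NA
  c(NA, x)[idx + 1]
}

# Hypothetical data frame standing in for 'dataset'
dataset <- data.frame(a = c(1, NA, NA, 4, NA),
                      b = c(NA, 2, NA, NA, 5))

# Apply column by column, keeping the data frame structure
dataset[] <- lapply(dataset, locf)
print(dataset)
```

Each column is processed in one vectorized pass, so the cost per series is
a few vector operations rather than one data frame assignment per row.
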
-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
