[R] How to speed up a double loop?

jeff6868 Mon, 02 Mar 2015 05:01:16 -0800

Dear R-users,

I would like to speed up a double loop I wrote to detect and remove
outliers in my whole data.frame. The idea is to remove any value that
differs too much from the previous value. Once an outlier is detected, the
test must be applied to at most the next 10 values following the last
correct one (and a flag is put in another column).


It works well on a small data frame, but it is far too slow for my real
data frame with 500,000 rows.
Here is a small fake data set and the double loop:

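    myts <- data.frame(x = c(1, 2, 50, 40, 30, 40, 100, 1, 50, 1, 2, 3, 3, 5, 4), y = NA)

    # Compare each value with the 10 values that follow it.
    for (jj in 1:(nrow(myts) - 10)) {
        for (nn in (jj + 1):(jj + 10)) {
            # Flag and blank any later value more than 15 away from the reference.
            if (!is.na(myts[jj, 1]) && !is.na(myts[nn, 1]) &&
                abs(myts[nn, 1] - myts[jj, 1]) > 15) {
                myts[nn, 2] <- 1
                myts[nn, 1] <- NA
            }
        }
    }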

Can somebody explain how I can speed this up easily? I have heard about
vectorization, but I don't really understand how it works.
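
For reference, here is a minimal sketch of what a partly vectorized version
of the loop above might look like. It assumes that most of the time is
likely spent indexing the data.frame one element at a time, so it works on
a plain numeric vector and replaces only the inner loop with a single
vectorized comparison. The function name `flag_outliers` and its `window`
and `threshold` arguments are hypothetical names introduced for
illustration; the values 10 and 15 come from the loop above.

```r
# Sketch only: same rule as the double loop, but on a plain vector.
# flag_outliers, window and threshold are illustrative names, not an existing API.
flag_outliers <- function(x, window = 10, threshold = 15) {
  y <- rep(NA_real_, length(x))
  # Nothing to compare if the series is not longer than the window.
  if (length(x) <= window) return(data.frame(x = x, y = y))

  for (jj in seq_len(length(x) - window)) {
    # A value that was already removed cannot serve as a reference.
    if (is.na(x[jj])) next

    # Test the whole window at once instead of looping over nn.
    idx <- (jj + 1):(jj + window)
    bad <- idx[which(abs(x[idx] - x[jj]) > threshold)]  # which() drops NAs

    x[bad] <- NA
    y[bad] <- 1
  }
  data.frame(x = x, y = y)
}

myts <- data.frame(x = c(1, 2, 50, 40, 30, 40, 100, 1, 50, 1, 2, 3, 3, 5, 4), y = NA)
result <- flag_outliers(myts$x)
```

The outer loop is kept because each step depends on values removed in
earlier steps, so this sketch only vectorizes the comparison inside the
window; whether that is fast enough for 500,000 rows would need to be
timed on the real data.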




--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-speed-up-a-double-loop-tp4704054.html
Sent from the R help mailing list archive at Nabble.com.

