Hi everybody,

I have a small problem with a function that is meant to remove short runs of
identical numeric values.

As an example, consider this data, which contains only the values "0" and
"1":

test <- data.frame(x=c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1))

The aim is simply to replace each run of "1" shorter than 5 with NA, and to
keep runs of "1" of length 5 or more.
So my final data should look like this:

final <- data.frame(x=c(0,0,NA,NA,NA,0,0,0,0,1,1,1,1,1,1,1,1))

For the moment, I have this function:

    foo <- function(X,N){
      tab <- table(X[X==1])
      under.n <- as.numeric(names(tab)[tab<N]) 
      ind <- X %in% under.n
      Ind.sup <- which(ind)
      X <- ifelse(ind,NA,X)
    }

test$x <- apply(as.data.frame(test$x),2,function(x) foo(x,5))

The problem is that the function doesn't consider each run separately:
table() counts all the "1" values together, as if they formed a single run.
I think that using rle() instead of table() in my function should do the
trick, but I haven't got it to work yet.
Does anyone have an idea how to fix this?
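
One possible rle()-based approach, as a minimal sketch only: rle() splits the
vector into runs, the short runs of the target value are set to NA, and
inverse.rle() rebuilds the full-length vector. The function name
drop_short_runs and its arguments are hypothetical, and it assumes that a run
of exactly N values should be kept.

```r
# Hypothetical helper: replace runs of `value` shorter than `n` with NA.
drop_short_runs <- function(x, n, value = 1) {
  r <- rle(x)                          # run lengths and run values
  # Indices of runs that hold `value` and are shorter than n.
  short <- which(r$values == value & r$lengths < n)
  r$values[short] <- NA                # mark those whole runs as NA
  inverse.rle(r)                       # expand back to the original length
}

test <- data.frame(x = c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1))
test$x <- drop_short_runs(test$x, 5)
```

Because each run is handled on its own, the run of three "1" values is
replaced while the run of eight is left untouched.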




