The example below is a pared-down version of a much larger dataset. My goal is to use the runs of 0's and 1's in DF$col2 to guide modification of that same column, subject to the following rules:
- Groups of 1's that are separated from other, larger groups of 1's in 'col2' by 2 or more years should be converted to 0.
- Groups of 1's need to span at least 2 consecutive years to be preserved.

So in the example provided below, DF$col2 would be overwritten with:

c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,1,1,1,1,1,1,1,1)

That is, the first group of 1's (positions 3 through 6) is separated from the other groups of 1's by 2 (or more) years, and the second group of 1's (positions 11 & 12) spans only a single year and so does not meet the criterion of being at least 2 years long.

The R script below builds the small example I'm working with, called "DF". The code that comes after the first statement is my attempt to go through some R-gymnastics to append a column to DF (called "grpng", computed as "isl2") that reflects the number of consecutive years in each 0/1 group, where the +/- sign denotes the original binary condition: 0 = negative, 1 = positive. However, I'm stuck on how to proceed further. Could someone please help me come up with a script that modifies DF$col2 shown below to be like that shown above?

DF <- data.frame(col1 = rep(1991:2004, each = 2),
                 col2 = c(0,0,1,1,1,1,0,0,0,0,1,1,0,0,1,1,1,1,0,0,1,1,1,1,1,1,1,1))

# flag each change in col2, then number the runs (0, 1, 2, ...)
DF$inc <- c(0, abs(diff(DF$col2)))
DF$cum <- cumsum(DF$inc)

# number of distinct years spanned by each run
ex1 <- aggregate(col1 ~ cum, data = DF, function(x) length(unique(x)))
names(ex1) <- c('cum', 'isl')

# attach run lengths to the rows; merge() sorts on cum, which is already
# non-decreasing in DF, so the rows stay aligned with DF
tmp1a <- merge(DF, ex1, by = "cum", all.x = TRUE)

# signed run length: positive for runs of 1, negative for runs of 0
tmp1a$isl2 <- ifelse(tmp1a$col2 == 1, tmp1a$isl, -tmp1a$isl)
DF$grpng <- tmp1a$isl2

At this point I was thinking I could use DF$grpng to sweep through col2 and make adjustments, but I didn't know how to proceed.
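One possible way to proceed, offered only as a rough sketch under stated assumptions rather than a settled answer, is to work on the runs directly with rle(). The function name clean_runs(), its arguments min_years and min_gap, and the column name col2_clean are all hypothetical. The sketch assumes that every row within a given year carries the same col2 value, so that counting distinct years per run is meaningful. It also assumes one particular reading of the first rule: after the too-short groups are removed, a group of 1's is dropped when no neighbouring group of 1's lies within a gap of fewer than 2 years, except that the longest group is always kept.

```r
# Hypothetical helper (not from the original post).
#   years     : year of each row, e.g. DF$col1
#   x         : 0/1 vector, e.g. DF$col2
#   min_years : shortest run of 1's (in years) that is preserved
#   min_gap   : gap (in years) at or above which a run counts as isolated
clean_runs <- function(years, x, min_years = 2, min_gap = 2) {
  # Describe consecutive equal values as runs, with distinct years per run
  summarise_runs <- function(v) {
    r  <- rle(v)
    id <- rep(seq_along(r$lengths), r$lengths)
    list(id    = id,
         value = r$values,
         yrs   = as.vector(tapply(years, id, function(y) length(unique(y)))))
  }

  # Rule 2: runs of 1 spanning fewer than min_years years become 0
  s <- summarise_runs(x)
  short <- s$value == 1 & s$yrs < min_years
  x[short[s$id]] <- 0

  # Rule 1 (assumed reading): recompute the runs, since zeroing short runs
  # merges neighbouring gaps. Runs alternate between 0 and 1, so the run
  # two places away is the neighbouring run of 1 and the run in between
  # is the gap.
  s <- summarise_runs(x)
  n <- length(s$value)
  i <- seq_len(n)
  near_left  <- (i >= 3)     & (c(Inf, s$yrs[-n]) < min_gap)
  near_right <- (i <= n - 2) & (c(s$yrs[-1], Inf) < min_gap)

  is_one  <- s$value == 1
  longest <- max(s$yrs[is_one], -Inf)   # -Inf guards the "no 1's" case

  # Drop runs of 1 with no close neighbour, unless they tie for longest
  drop <- is_one & !(near_left | near_right) & (s$yrs < longest)
  x[drop[s$id]] <- 0
  x
}

DF <- data.frame(col1 = rep(1991:2004, each = 2),
                 col2 = c(0,0,1,1,1,1,0,0,0,0,1,1,0,0,1,1,1,1,0,0,1,1,1,1,1,1,1,1))
DF$col2_clean <- clean_runs(DF$col1, DF$col2)
DF$col2_clean
```

Writing the result to a new column leaves the original col2 available for comparison. Whether this reading of the first rule matches what is needed on the full dataset, for instance when several isolated groups are equally long, would need to be checked against more cases.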
For debugging purposes, a slightly different example would go from:

DF <- data.frame(col1 = rep(1991:2004, each = 2),
                 col2 = c(1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1))

to 'col2' looking like:

c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1)

That is, even though the first group of 1's spans more than two consecutive years, it is separated from a larger group of 1's by 2 (or more) years.

______________________________________________
R-help@r-project.org mailing list