I'm trying to study periods in which flow was at or above a given level. To do so, I have written code that measures how long the series stays at a high level. But for some reason it calculates the runs one hour too long. Any ideas on why?
Code:

library(dplyr)  # provides %>%, mutate, group_by, summarise, n
library(sqldf)  # provides sqldf

Date <- format(seq(as.POSIXct("2014-01-01 01:00"), as.POSIXct("2015-01-01 00:00"), by = "hour"),
               "%Y-%m-%d %H:%M", usetz = FALSE)
Flow <- runif(8760, 0, 2300)

# Flag each hour: 1 if flow is at or above 1600, otherwise 0.
IsHigh <- function(x) {
  if (x < 1600) return(0)
  if (1600 <= x) return(1)
}
isHighFlow <- unlist(lapply(Flow, IsHigh))
df <- data.frame(Date, Flow, isHighFlow)

# Start a new group at every low-flow hour, then summarise each group.
temp <- df %>%
  mutate(highFlowInterval = cumsum(isHighFlow == 0)) %>%
  group_by(highFlowInterval) %>%
  summarise(hoursHighFlow = n(),
            minDate = min(as.character(Date)),
            maxDate = max(as.character(Date)))

# Then join the two tables together.
temp2 <- sqldf("SELECT * FROM temp LEFT JOIN df ON df.Date BETWEEN temp.minDate AND temp.maxDate")
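One possible explanation, offered as a sketch rather than a definitive diagnosis: cumsum(isHighFlow == 0) increments on every low-flow hour, so each group begins with that low-flow row, and n() counts it together with the high-flow hours that follow. The base-R example below uses a small made-up vector (the names flow, is_high, grp, and the toy values are assumptions for illustration; only the 1600 threshold comes from the code above) to show the off-by-one and one way around it with rle().

```r
# Toy series (made-up values); the 1600 threshold is taken from the post.
flow <- c(100, 1700, 1800, 200, 1900, 2000, 2100, 300)
is_high <- as.integer(flow >= 1600)   # 0 1 1 0 1 1 1 0

# Same grouping idea as in the post: the counter increments on each
# low-flow hour, so every group starts with a low-flow row.
grp <- cumsum(is_high == 0)           # 1 1 1 2 2 2 2 3

# Rows per group, which is what n() would report.
rows_per_group <- as.vector(table(grp))                  # 3 4 1

# High-flow hours per group, counting only the flagged rows.
high_per_group <- as.vector(tapply(is_high, grp, sum))   # 2 3 0

# Alternative: run-length encoding measures consecutive equal values directly.
r <- rle(is_high)
run_lengths <- r$lengths[r$values == 1]                  # 2 3

print(rows_per_group)
print(high_per_group)
print(run_lengths)
```

If this is indeed the cause, replacing n() with sum(isHighFlow) in the summarise() call should count only the high-flow hours. Note that minDate would still be the timestamp of the leading low-flow hour in each group, so the date range may also need adjusting.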