I think Rich has shared aspects of the data before and may have forgotten we want something here and now.
Besides a small sample of what the relevant columns look like and a suggestion of what he wants some new column to look like, we probably need more to understand what he wants. The issue could be a bit like that of people who want to group their data by quarter, for example, or by some other aspect such as when someone started and ended one topic and switched to another. There is no way we can guess what he actually wants. What Rich writes may be perfectly clear to him but not to others.

It does sound like there are periods when people sit there and record measurements in seemingly multiple (contiguous?) records, with each recording the time at intervals such as every 5, 10, or 30 minutes. So a wild guess might be to cluster them together by finding a GAP, meaning a place where the next record is too far in time from the previous one. In essence, the condition seems to be that:

    time-of-current-record - time-of-previous-record > threshold

where threshold may simply be thirty minutes, assuming that all the records are also in the same series, as in locations of measurement, and do not intertwine.

I assume, as usual, there are umpteen ways to deal with such sliding window problems, but I am loath to suggest any ideas until Rich has more clearly defined the issue, perhaps by including a small amount of data in a format trivial to copy/paste into our R implementation to play with and verify that the solution seems to work.

But very loosely speaking, a simple sliding window of one might work. In base R, you can use some form of loop, obviously, starting with row 2, that compares row N to row N-1 and sets some new column value to something like 1 until it encounters a big enough gap, when it starts setting it to 2, and so on. A later pass on the new data could use grouping by that column, IF all of what I assume makes sense.
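The loop described above can be sketched roughly as follows. This is only an illustration of my guess about the requirement, not a tested solution for Rich's data: the sample timestamps are made up, the column names gap_min, group and group2 are hypothetical, and the 30-minute threshold is just the value floated above. Only the datetime column name comes from his description.

```r
# Made-up sample: readings 5 minutes apart, a long gap, then 10 minutes apart.
df <- data.frame(
  datetime = as.POSIXct(
    c("2021-12-15 08:00:00", "2021-12-15 08:05:00", "2021-12-15 08:10:00",
      "2021-12-15 10:00:00", "2021-12-15 10:10:00", "2021-12-15 10:20:00"),
    tz = "UTC"
  )
)

threshold <- 30  # minutes; an assumed cutoff, not a confirmed requirement

# Minutes elapsed since the previous record (NA for the first row).
df$gap_min <- c(NA, as.numeric(difftime(df$datetime[-1],
                                        df$datetime[-nrow(df)],
                                        units = "mins")))

# Explicit loop: start at row 2, compare row N with row N-1, and bump the
# group number whenever the gap exceeds the threshold.
df$group <- 1L
for (i in seq_len(nrow(df))[-1]) {
  df$group[i] <- df$group[i - 1] + as.integer(df$gap_min[i] > threshold)
}

# Vectorised equivalent: a running count of the "big gap" flags.
df$group2 <- cumsum(c(TRUE, df$gap_min[-1] > threshold))

print(df)
```

Both group and group2 should come out the same, so a later pass could split or aggregate on either one, for example with split(df, df$group). This assumes the rows are already sorted by datetime and belong to a single series.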
And, of course, the tidyverse has perhaps easier-to-use functionality, such as the non-base functions lag() and lead() used within something like mutate():

https://dplyr.tidyverse.org/reference/lead-lag.html

But again, you need clearer requirements. You asked how to find when DATES change. That is not the same as my guess, since the date changes at midnight local time, so two measures taken seconds apart could fall on different dates. If you want to know when clusters of non-overlapping measures change, that is another issue.

And what exactly do you want to do after determining when things change? Depending on what you want, you may need a different way to solve the initial problem. I mentioned the idea of grouping by another variable you create as one such possibility. But many other solutions would not make a grouping variable on every row, but would insert some kind of cut mark in just the first row, or add a special row between groups, or anything else your imagination supplies.

Clearly, you do not want us to solve the entire problem you are working on, but more context may get you answers to the specific thing you are working on.

And note that adding a new time column may not be required, as such values can be created on the fly in some places, given the other columns. But it does help to have it in place, at least for a while, if you want to provide answers such as how many measures were made in what total amount of time (first to last).

-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of jim holtman
Sent: Wednesday, December 15, 2021 1:05 PM
To: Rich Shepard <rshep...@appl-ecosys.com>
Cc: R mailing list <r-help@r-project.org>
Subject: Re: [R] Changing time intervals in data set

At least show a sample of the data and then what you would like as output.

Thanks

Jim Holtman
*Data Munger Guru*

*What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.*

On Wed, Dec 15, 2021 at 6:40 AM Rich Shepard <rshep...@appl-ecosys.com> wrote:

> A 33-year set of river discharge data at one gauge location has
> recording intervals of 5, 10, and 30 minutes over the period of record.
>
> The data.frame/tibble has columns for year, month, day, hour, minute,
> and datetime.
>
> Would difftime() allow me to find the dates when the changes occurred?
>
> TIA,
>
> Rich
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.