Am I seeing an odd aspect to this discussion? There are many ways to solve problems, and some people favor certain ones more than others.
All require some examination of the data so it can be massaged into shape for the processes that follow. If you insist on using the matrix method, arranging that each row or column has the data you want, then yes, you need to guarantee that all your data is present and in the right order. If some may be missing, you may want to write a program that generates all possible dates in order and merges your data into that sequence (or, more likely, into a copy) so that every missing item is represented and shows up as an NA or whatever you want. You may also want to check that all dates are in order with no duplicates, and anything else that makes sense. Then you are free to ask for the vector to be seen as a matrix with N columns or rows.

For many, the cleaner solution is to use constructs that are more resistant to imperfections or that allow them to be handled better. I would probably use tidyverse functionality these days but can easily understand people preferring base R or other packages.

I have done similar analyses of real data gathered from streams: various chemicals and levels measured at various times and depths, including times when no measurement happened and times when there was more than one. In that situation it is much more robust to use methods like group_by and then apply the other verbs to the grouped data, especially when the next steps involve making plots with ggplot. It was rather trivial, for example, to replace multiple measures by their average. And many of my plots are faceted by variables, which is not trivial to do in base R.

I suggest not falling in love with the first way you think of and then trying to bend everything to fit. Yes, some methods may be quite a bit more efficient, but I rarely run into problems even with quite large collections of data, like a quarter million rows with dozens of columns, including odd columns like the output of some analysis.
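
As a rough sketch of the checks described above, here is one way it might look in base R, so it needs no extra packages. The data frame obs, its column names, and the UTC time zone are all made up for illustration and are not taken from the data in this thread. It averages duplicate readings, merges the result onto a complete 5-minute sequence so gaps show up as NA, and then takes daily means. The same grouping could equally be written with group_by and summarise.

```r
# Hypothetical 5-minute readings (assumed names and values):
# the 00:10 reading is missing and 00:15 was recorded twice.
obs <- data.frame(
  stamp = as.POSIXct(c("2021-08-30 00:00:00", "2021-08-30 00:05:00",
                       "2021-08-30 00:15:00", "2021-08-30 00:15:00",
                       "2021-08-30 00:20:00"), tz = "UTC"),
  value = c(1.0, 2.0, 3.0, 5.0, 6.0)
)

# Sanity checks: count duplicated time stamps and test the ordering.
n_dup <- sum(duplicated(obs$stamp))
out_of_order <- is.unsorted(as.numeric(obs$stamp))

# Replace multiple measures at the same time stamp by their average.
obs1 <- aggregate(value ~ stamp, data = obs, FUN = mean)

# Generate every expected time stamp and merge the readings onto it,
# so that missing readings appear as NA rows.
grid <- data.frame(stamp = seq(min(obs1$stamp), max(obs1$stamp), by = "5 min"))
full <- merge(grid, obs1, by = "stamp", all.x = TRUE)

# Daily means; the formula interface of aggregate() drops NA rows by default.
full$day <- as.Date(full$stamp, tz = "UTC")
daily <- aggregate(value ~ day, data = full, FUN = mean)
```

Once full has one row per expected time stamp, full$value could also be reshaped with matrix() if that is the preferred route. The grouped version does not need that step at all.
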
And note that the current set of data may be extended with more over time, or you may get other data collected that would not necessarily work well with a hard-coded method but that a more flexible method might easily accommodate.

-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of Rich Shepard
Sent: Monday, August 30, 2021 7:34 PM
To: R Project Help <r-help@r-project.org>
Subject: Re: [R] Calculate daily means from 5-minute interval data

On Tue, 31 Aug 2021, Richard O'Keefe wrote:

> I made up fake data in order to avoid showing untested code. It's not
> part of the process I was recommending. I expect data recorded every N
> minutes to use NA when something is missing, not to simply not be
> recorded. Well and good, all that means is that reshaping the data is
> not a trivial call to matrix(). It does not mean that any additional
> package is needed or appropriate and it does not affect the rest of the process.

Richard,

The instruments in the gauge pipe don't know to write NA when they're not measuring. :-) The outage period varies greatly by location, constituent measured, and other unknown factors.

> You will want the POSIXct class, see ?DateTimeClasses. Do you know
> whether the time stamps are in universal time or in local time?

The data values are not timestamps. There's one column for date, a second column for time, and a third column for time zone (P in the case of the west coast).

> Above all, it doesn't affect the point that you probably should not be
> doing any of this.

? (Doesn't require an explanation.)

Rich

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.