(This is somewhat a change of subject from the original question.)

Rich, there are functions such as aggregate() in base R. There are also many options in CRAN packages.
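For comparison, here is a minimal sketch of what the aggregate() route might look like. It uses a small made-up data frame in place of the real rainfall data; the toy values and the names rain, pr.med, pr.max, and rain.agg are illustrative assumptions, not taken from the thread.

```r
## toy stand-in for the rainfall data frame (values are made up)
rain <- data.frame(
  name     = rep(c("A", "B"), each = 4),
  sampdate = as.Date(rep(c("2005-01-01", "2005-01-02",
                           "2005-02-01", "2005-02-02"), times = 2)),
  prcp     = c(0.59, 0.08, 0.10, 0.00, 0.02, 0.05, 0.10, 0.30)
)

## year-month label used as the grouping variable
rain$mon <- format(rain$sampdate, "%Y-%m")

## one aggregate() call per statistic, grouped by station and month
pr.med <- aggregate(prcp ~ name + mon, data = rain, FUN = median)
pr.max <- aggregate(prcp ~ name + mon, data = rain, FUN = max)

## rename the value columns so they do not collide in the merge
names(pr.med)[names(pr.med) == "prcp"] <- "pr.med"
names(pr.max)[names(pr.max) == "prcp"] <- "pr.max"

## combine the two summaries into one row per station and month
rain.agg <- merge(pr.med, pr.max, by = c("name", "mon"))
print(rain.agg)
```

This version does not carry easting, northing, or elev along; they would have to be added to the formula or merged back in afterwards.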
But I tend to have difficulty getting them to do exactly what I want, and usually end up rolling my own.

The idea is to split the data into groups by station and month, then calculate summary stats for each group, then recombine into a new data frame.

## untested with your data, but this kind of approach works well for me
## note that this code assumes easting, northing, and elevation are in fact unique within each group
## if they are not, you will get an ERROR

## add a 'month' variable
raindf <- rainfall
raindf$mon <- format(raindf$sampdate,'%Y-%m')

mysum <- function(df) {
  data.frame(
    name=unique(df$name),
    easting=unique(df$easting),
    northing=unique(df$northing),
    elev=unique(df$elev),
    mon=unique(df$mon),
    pr.med=median(df$prcp),
    pr.max=max(df$prcp)
  )
}

tmpdf <- split(raindf, paste(raindf$name, raindf$mon) )

## at this point, you can check your summary stats function with, for example,
mysum(tmpdf[[1]])
mysum(tmpdf[[2]])

## when satisfied with mysum(), do this
tmpsum <- lapply(tmpdf, mysum)

## recombine
rain.by.mon <- do.call(rbind, tmpsum)

## might still want to create a numeric month to facilitate plotting
## or maybe assign each month to the first of the month, or the 15th, or end or whatever makes sense
rain.by.mon$mondt <- as.Date(paste0(rain.by.mon$mon,'-1'))

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509


On 9/4/18, 9:41 AM, "R-help on behalf of Rich Shepard" <r-help-boun...@r-project.org on behalf of rshep...@appl-ecosys.com> wrote:

    On Mon, 3 Sep 2018, Rich Shepard wrote:

    > Is there a process by which these plots can be 'thinned' so they show the
    > same overall patterns but with fewer points so they display more quickly?

    Bert/Paul/David/John:

    Thanks very much for the suggestions. I think an appropriate way to
    illustrate the patterns is to plot the median and maximum for each month
    (for all sites).
    That's the important information and plotting each daily point over 13
    years obscures that information.

    The dataframe is structured this way:

    str(rainfall)
    'data.frame':   113569 obs. of  6 variables:
     $ name    : chr  "Headworks Portland Water" "Headworks Portland Water" "Headworks Portland Water" "Headworks Portland Water" ...
     $ easting : num  2370575 2370575 2370575 2370575 2370575 ...
     $ northing: num  199338 199338 199338 199338 199338 ...
     $ elev    : num  228 228 228 228 228 228 228 228 228 228 ...
     $ sampdate: Date, format: "2005-01-01" "2005-01-02" ...
     $ prcp    : num  0.59 0.08 0.1 0 0 0.02 0.05 0.1 0 0.02 ...

    There are probably multiple ways of extracting the monthly median and
    maximum 'prcp' and I don't know how to identify the appropriate one. Is
    there a task view for this type of data manipulation?

    I've not before done anything like this and would appreciate a pointer to
    where I start to learn.

    Regards,

    Rich

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.