I just realized that my method as written only works if the entire date range is within a single calendar year. Here is a revised version that should handle the general case.
In this version I assume that the conversion from the original numbers to dates has already been done e.g. it can be done using ymd() as I showed before library(dplyr) library(lubridate) ## create some example data for 36 months from 2010 d <- seq(from=as.Date("2010-01-01"),to=as.Date("2012-12-31"),by="1 day") conc <- runif(length(d)) # concentrations df <- data.frame( dt=d, conc=conc) df2 <- dplyr::mutate( df, wkpair=as.integer(floor((0:(nrow(df)-1))/14)) ) df3 <- dplyr::group_by(df2,wkpair) %>% dplyr::summarise(from=min(dt), to=max(dt), maxconc=max(conc)) %>% dplyr::select(from,to,maxconc) df3 # A tibble: 79 x 3 from to maxconc <date> <date> <dbl> 1 2010-01-01 2010-01-14 0.928 2 2010-01-15 2010-01-28 0.834 3 2010-01-29 2010-02-11 0.989 4 2010-02-12 2010-02-25 0.983 5 2010-02-26 2010-03-11 0.898 6 2010-03-12 2010-03-25 0.864 7 2010-03-26 2010-04-08 0.920 8 2010-04-09 2010-04-22 0.925 9 2010-04-23 2010-05-06 0.998 10 2010-05-07 2010-05-20 0.761 # ... with 69 more rows On Wed, Jun 19, 2019 at 5:08 PM Eric Berger <ericjber...@gmail.com> wrote: > Hi Bert, > I agree that our interpretation is slightly different, which is why I > wrote: > "If one can take the actual week of the year as an acceptable definition > of week, then here's my approach." > > Regards, > Eric > > On Wed, Jun 19, 2019 at 5:04 PM Bert Gunter <bgunter.4...@gmail.com> > wrote: > >> Eric: >> >> I believe you're doing something different than I did. I broke up each >> month into biweekly periods, 2+ per month. You seem to be grouping the >> overall entire period into biweekly intervals -- apologies if I'm wrong, >> but if I understood correctly, that's not the same thing. I do not know >> which of us -- if either -- has interpreted her query correctly. >> >> Cheers, >> Bert >> >> On Wed, Jun 19, 2019 at 2:35 AM Eric Berger <ericjber...@gmail.com> >> wrote: >> >>> Hi Siti, >>> I didn't test Bert's code but I assume it's fine. :-) >>> I would take a different approach than Bert. I prefer to use a package >>> such as lubridate to handle the date wrangling, and a package such as dplyr >>> to handle the grouping and max extraction. >>> It may be overkill for this problem, but these are great packages to >>> become familiar with. >>> If one can take the actual week of the year as an acceptable definition >>> of week, then here's my approach. >>> >>> library(lubridate) >>> library(dplyr) >>> >>> # Step 1: start with Bert's code to create sample data >>> ## create some example data for 3 months in 2000 >>> d<- 2e7 +c(113:131,201:228, 301:330) ## dates >>> conc <- runif(length(d)) # concentrations >>> >>> # Step 2: collect the data into a data frame >>> df <- data.frame( dt=d, conc=conc) >>> >>> # Step 3: use lubridate's ymd() function to parse the dates, its week() >>> function to identify the week of the year, and define the new column >>> 'wkpair' that groups the weeks 2-at-a-time >>> df2 <- dplyr::mutate( df, >>> wkpair=as.integer(floor(lubridate::week(lubridate::ymd(dt) )/2)) ) >>> >>> # Step 4: group by the wkpair and use dplyr's summarise to get the info >>> you wanted >>> df3 <- dplyr::group_by(df2,wkpair) %>% >>> dplyr::summarise( from=min(dt), to=max(dt), maxconc=max(conc)) >>> %>% >>> dplyr::select(from,to,maxconc) >>> >>> df3 >>> >>> # A tibble: 6 x 3 >>> from to maxconc >>> <dbl> <dbl> <dbl> >>> 1 20000113 20000121 0.963 >>> 2 20000122 20000204 0.988 >>> 3 20000205 20000218 0.939 >>> 4 20000219 20000303 0.883 >>> 5 20000304 20000317 0.863 >>> 6 20000318 20000330 0.765 >>> >>> HTH, >>> Eric >>> >>> >>> >>> On Tue, Jun 18, 2019 at 9:39 PM Bert Gunter <bgunter.4...@gmail.com> >>> wrote: >>> >>>> My apologies. I negected to cc r-help. -- Bert >>>> >>>> >>>> >>>> On Tue, Jun 18, 2019 at 11:21 AM Bert Gunter <bgunter.4...@gmail.com> >>>> wrote: >>>> >>>> > >>>> > I assume that 20000215 means year 2000, month 2, day 15. >>>> > I also assume that you want maxes for the first 2 weeks of a month, >>>> the >>>> > second 2 weeks, and any remaining days. >>>> > I also assume that this might be desired for arbitrary numbers of >>>> years, >>>> > months, and days. >>>> > >>>> > The following is one way to do this. As it's a slight pain to cut and >>>> > paste email data as text into R (use ?dput or R code to run to provide >>>> > example data instead), I just made up my own. You'll have to do the >>>> > following within a data frame through extraction or by using with() of >>>> > course. >>>> > >>>> > ## create some example data for 3 months in 2000 >>>> > d<- 2e7 +c(113:131,201:228, 301:330) ## dates >>>> > conc <- runif(length(d)) # concentrations >>>> > >>>> > ## convert the date to character to extract year, month, and day >>>> > cdate <- as.character(d) >>>> > ## use substr to to the extraction >>>> > year <- substr(cdate,1,4) >>>> > mon <- substr(cdate,5,6) >>>> > day <- substr(cdate, 7,8) >>>> > >>>> > ## convert day to numeric and use cut() to group into the biweekly >>>> periods. >>>> > d14 <- cut(as.numeric(day), c(0,14.5,28.5, 32)) >>>> > >>>> > ## Use tapply() to create your desired table of results. >>>> > tapply(conc, list(year, d14, mon), max, na.rm = TRUE) >>>> > >>>> > ## Results >>>> > >>>> > , , 01 >>>> > >>>> > (0,14.5] (14.5,28.5] (28.5,32] >>>> > 2000 0.7357389 0.9655391 0.7962965 >>>> > >>>> > , , 02 >>>> > >>>> > (0,14.5] (14.5,28.5] (28.5,32] >>>> > 2000 0.8193979 0.9487207 NA >>>> > >>>> > , , 03 >>>> > >>>> > (0,14.5] (14.5,28.5] (28.5,32] >>>> > 2000 0.9718919 0.9997093 0.168659 >>>> > >>>> > >>>> > Cheers, >>>> > Bert >>>> > >>>> > Bert Gunter >>>> > >>>> > >>>> > >>>> > >>>> > On Tue, Jun 18, 2019 at 8:53 AM SITI AISYAH BINTI ZAKARIA < >>>> > aisyahzaka...@unimap.edu.my> wrote: >>>> > >>>> >> Hi, >>>> >> >>>> >> I'm Aisyah..I have a problem to run my R coding. I want to select >>>> maximum >>>> >> value according to week. >>>> >> >>>> >> here is my data >>>> >> >>>> >> Date O3_Conc >>>> >> 20000101 0.033 >>>> >> 20000102 0.023 >>>> >> 20000103 0.025 >>>> >> 20000104 0.041 >>>> >> 20000105 0.063 >>>> >> 20000106 0.028 >>>> >> 20000107 0.068 >>>> >> 20000108 0.048 >>>> >> 20000109 0.037 >>>> >> 20000110 0.042 >>>> >> 20000111 0.027 >>>> >> 20000112 0.035 >>>> >> 20000113 0.063 >>>> >> 20000114 0.035 >>>> >> 20000115 0.042 >>>> >> 20000116 0.028 >>>> >> >>>> >> I want to find the max value from column O3_Conc for only 14 days >>>> that >>>> >> refer to biweekly in month. And the next 14 days for the max value. >>>> >> >>>> >> I hope that I can get the result like this: >>>> >> >>>> >> Date Max O3_Conc >>>> >> 20000101 - 20000114 0.068 >>>> >> 20000115 - 20000129 0.061 >>>> >> >>>> >> I try many coding but still unavailable. >>>> >> >>>> >> this example my coding >>>> >> >>>> >> library(plyr) >>>> >> data.frame(CA0003) >>>> >> >>>> >> # format weeks as per requirement (replace "00" with "52" and >>>> >> adjust corresponding year) >>>> >> tmp <- list() >>>> >> tmp$y <- format(df$Date, format="%Y") >>>> >> tmp$w <- format(df$Date, format="%U") >>>> >> tmp$y[tmp$w=="00"] <- >>>> as.character(as.numeric(tmp$y[tmp$w=="00"]) - >>>> >> 14) >>>> >> tmp$w[tmp$w=="00"] <- "884" >>>> >> df$week <- paste(tmp$y, tmp$w, sep = "-") >>>> >> >>>> >> # get summary >>>> >> df2 <- ddply(df, .(week),transform, O3_Conc=max(O3_Conc)) >>>> >> >>>> >> # include week ending date >>>> >> tmp$week.ending <- lapply(df2$week, function(x) rev(df[df$week >>>> ==x, >>>> >> "O3_Conc"])[[1]]) >>>> >> df2$week.ending <- sapply(tmp$week.ending, max(O3_Conc, TRUE) >>>> >> >>>> >> output >>>> >> Site_Id Site_Location >>>> Date >>>> >> Year O3_Conc Month Day week >>>> >> 1 CA0003 Sek. Keb. Cederawasih, Taman Inderawasih, Perai >>>> 20000101 >>>> >> 2000 0.033 1 1 NULL-NULL >>>> >> 2 CA0003 Sek. Keb. Cederawasih, Taman Inderawasih, Perai >>>> 20000102 >>>> >> 2000 0.023 1 2 NULL-NULL >>>> >> 3 CA0003 Sek. Keb. Cederawasih, Taman Inderawasih, Perai >>>> 20000103 >>>> >> 2000 0.025 1 3 NULL-NULL >>>> >> 4 CA0003 Sek. Keb. Cederawasih, Taman Inderawasih, Perai >>>> 20000104 >>>> >> 2000 0.041 1 4 NULL-NULL >>>> >> 5 CA0003 Sek. Keb. Cederawasih, Taman Inderawasih, Perai >>>> 20000105 >>>> >> 2000 0.063 1 5 NULL-NULL >>>> >> 6 CA0003 Sek. Keb. Cederawasih, Taman Inderawasih, Perai >>>> 20000106 >>>> >> 2000 0.028 1 6 NULL-NULL >>>> >> 7 CA0003 Sek. Keb. Cederawasih, Taman Inderawasih, Perai >>>> 20000107 >>>> >> 2000 0.068 1 7 NULL-NULL >>>> >> 8 CA0003 Sek. Keb. Cederawasih, Taman Inderawasih, Perai >>>> 20000108 >>>> >> 2000 0.048 1 8 NULL-NULL >>>> >> 9 CA0003 Sek. Keb. Cederawasih, Taman Inderawasih, Perai >>>> 20000109 >>>> >> 2000 0.037 1 9 NULL-NULL >>>> >> 10 CA0003 Sek. Keb. Cederawasih, Taman Inderawasih, Perai >>>> 20000110 >>>> >> 2000 0.042 1 10 NULL-NULL >>>> >> 11 CA0003 Sek. Keb. Cederawasih, Taman Inderawasih, Perai >>>> 20000111 >>>> >> 2000 0.027 1 11 NULL-NULL >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> -- >>>> >> This message has been scanned by E.F.A. Project and is believed to be >>>> >> clean. >>>> >> >>>> >> ______________________________________________ >>>> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >>>> >> https://stat.ethz.ch/mailman/listinfo/r-help >>>> >> PLEASE do read the posting guide >>>> >> http://www.R-project.org/posting-guide.html >>>> >> and provide commented, minimal, self-contained, reproducible code. >>>> >> >>>> > >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> ______________________________________________ >>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide >>>> http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >>>> >>> [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.