It looks to me as though the problem is in your input data. The following example works for me:
df1<-data.frame(Date=paste(rep(6:8,each=20), c(sort(sample(1:30,20)),sort(sample(1:30,20)),sort(sample(1:30,20))), 2016,sep="/"), Rainfall_Duration=sample(c(10,20,30,40),60,TRUE)) df1$Date<-as.Date(df1$Date,"%m/%d/%Y") df2<-data.frame(Date=paste(rep(6:8,each=10), c(sort(sample(10:30,10)),sort(sample(10:30,10)),sort(sample(10:30,10))), 2016,sep="/"), Removal.Rate=runif(30,10,60)) df2$Date<-as.Date(df2$Date,"%m/%d/%Y") df3<-data.frame(Rate.Removal.Date=NULL,Date=NULL,Rainfall_Duration=NULL) df3row<-0 for(i in 1:dim(df2)[1]) { rdrows<-which(df2$Date[i] >= df1$Date & !(df2$Date[i] > df1$Date + 8)) # if there are no dates in df1 within the prior 7 days if(!length(rdrows)) { # first check if at least one date in df1 is less than the df2 # date and is not included in the last set of df1 dates checkrows<-which(df2$Date[i] >= df1$Date) # use the last date greater than the maximum in lastrows if(any(checkrows > lastrows)) rdrows<-max(checkrows[checkrows > lastrows]) # otherwise use the last set else rdrows<-lastrows } # save the current set of dates lastrows<-rdrows # get the number of new rows nrows<-length(rdrows) for(row in 1:nrows) { # set the values in each row df3[row+df3row,1]<-format(df2$Date[i],"%m/%d/%Y") df3[row+df3row,2]<-format(df1$Date[rdrows[row]],"%m/%d/%Y") df3[row+df3row,3]<-df1$Rainfall_Duration[rdrows[row]] } df3row<-df3row+nrows } names(df3)<-c("Rate.Removal.Date","Date","Rainfall_Duration") df3 Jim On Fri, Feb 3, 2017 at 4:05 AM, Md Sami Bin Shokrana <samim...@live.com> wrote: > > Hi Jim, > > Thanks a lot. As you already mentioned, this is a clunky method. This code > runs fine with the sample data I provided here. But my real data is much > larger (more than 50 observations for df1) and this code does not work. I > am getting this error: > Error in `*tmp*`[[j]] : subscript out of bounds > Do you have any solution for that? You have already done a lot. So, I > really appreciate your effort. Thanks > > ------------------------------ > *From:* Jim Lemon <drjimle...@gmail.com> > *Sent:* Thursday, February 2, 2017 2:07 PM > > *To:* Md Sami Bin Shokrana; r-help mailing list > *Subject:* Re: [R] sub-setting rows based on dates in R > > Hi Md, > What I have done is to use the most recent intervening date between the > last set of dates if any are there, otherwise the last set of dates. That > is what I understand from your description. > > Remember that this is a very clunky way to do something like this by > adding rows to a data frame, and it is likely to scale up to large data > sets badly. > > df1<-read.table(text="Date Rainfall_Duration > 6/14/2016 10 > 6/15/2016 20 > 6/17/2016 10 > 8/16/2016 30 > 8/19/2016 40 > 8/21/2016 20 > 9/4/2016 10", > header=TRUE,stringsAsFactors=FALSE) > # change the character strings in df2$Date to Date values > df1$Date<-as.Date(df1$Date,"%m/%d/%Y") > > df2<-read.table(text="Date Removal.Rate > 6/17/2016 64.7 > 6/30/2016 22.63 > 7/14/2016 18.18 > 8/19/2016 27.87 > 8/30/2016 23.45 > 9/2/2016 17.2", > header=TRUE,stringsAsFactors=FALSE) > # change the character strings in df2$Date to Date values > df2$Date<-as.Date(df2$Date,"%m/%d/%Y") > > df3<-data.frame(Rate.Removal.Date=NULL,Date=NULL,Rainfall_Duration=NULL) > > df3row<-0 > > for(i in 1:dim(df2)[1]) { > rdrows<-which(df2$Date[i] >= df1$Date & !(df2$Date[i] > df1$Date + 8)) > # if there are no dates in df1 within the prior 7 days > if(!length(rdrows)) { > # first check if at least one date in df1 is less than the df2 > # date and is not included in the last set of df1 dates > checkrows<-which(df2$Date[i] >= df1$Date) > # use the last date greater than the maximum in lastrows > if(any(checkrows > lastrows)) > rdrows<-max(checkrows[checkrows > lastrows]) > # otherwise use the last set > else rdrows<-lastrows > } > # save the current set of dates > lastrows<-rdrows > # get the number of new rows > nrows<-length(rdrows) > for(row in 1:nrows) { > # set the values in each row > df3[row+df3row,1]<-format(df2$Date[i],"%m/%d/%Y") > df3[row+df3row,2]<-format(df1$Date[rdrows[row]],"%m/%d/%Y") > df3[row+df3row,3]<-df1$Rainfall_Duration[rdrows[row]] > } > # keep count of the current number of rows > df3row<-df3row+nrows > } > > names(df3)<-c("Rate.Removal.Date","Date","Rainfall_Duration") > df3 > > Jim > > > On Thu, Feb 2, 2017 at 4:58 AM, Md Sami Bin Shokrana <samim...@live.com> > wrote: > >> Hi Jim, >> >> Thank you so much for your help. Your code works great. Could you please >> explain your code a bit? One more thing, I am so sorry that I forgot to >> mention one more criteria in my post. If it is not much trouble, could you >> please help me out with that? I have added a couple more observations >> (the bold ones) to each of my data frames which are shown below: >> >> >> The main concept is, >> >> (i) For a specific date in df2, if no matching dates are available in df1 >> within the 7 days range, the code will keep on looking for the latest >> available date in df1 with a "Rainfall_Duration" data. For example, in df2, >> for *8/30/2016*, there is no "Rainfall_Duration" data available in >> df1 within the prior 7 days range. So, I want the code to keep on looking >> for dates in df1 until there is an available data for "Rainfall_Duration" >> in df1 (in this case which is * 8/21/2016)* . >> >> >> (ii) Additionally, for* 9/2/2016 *(df2), there is no date available in >> df1 with a "Rainfall_Duration" data within prior 7 days range. The latest >> available data for "Rainfall_Duration" is *8/21/2016*. So, the code will >> extract the same result we had for *8/30/2016* in df2. >> >> In simpler words, i just want the code to keep on looking for data with >> "Rainfall_Duraiton" in df1 if there is none available within the prior 7 >> days range. Sorry for not mentioning it before. >> >> >> >> df1 <- >> >> Date Rainfall_Duration >> 6/14/2016 10 >> 6/15/2016 20 >> 6/17/2016 10 >> 8/16/2016 30 >> 8/19/2016 40 >> *8/21/2016* *20* *9/4/2016 10* >> >> >> df2 <- >> >> Date Removal.Rate >> 6/17/2016 64.7 >> 6/30/2016 22.63 >> 7/14/2016 18.18 >> 8/19/2016 27.87 >> *8/30/2016* *23.45* *9/2/2016 17.2* >> >> >> Expected output: >> >> >> df3 <- >> >> Rate.Removal.Date Date Rainfall_Duration >> 6/17/2016 6/14/2016 10 >> 6/17/2016 6/15/2016 20 >> 6/17/2016 6/17/2016 10 >> 6/30/2016 6/14/2016 10 >> 6/30/2016 6/15/2016 20 >> 6/30/2016 6/17/2016 10 >> 7/14/2016 6/14/2016 10 >> 7/14/2016 6/15/2016 20 >> 7/14/2016 6/17/2016 10 >> 8/19/2016 8/16/2016 30 >> 8/19/2016 8/19/2016 40 >> *8/30/2016* *8/21/2016* *20* *9/2/2016 8/21/2016 >> 20* >> >> >> Thanks in advance. >> >> >> ------------------------------ >> *From:* Jim Lemon <drjimle...@gmail.com> >> *Sent:* Wednesday, February 1, 2017 1:18 PM >> *To:* Md Sami Bin Shokrana; r-help mailing list >> *Subject:* Re: [R] sub-setting rows based on dates in R >> >> Hi Md, >> This kind of clunky, but it might do what you want. >> >> df1<-read.table(text="Date Rainfall_Duration >> 6/14/2016 10 >> 6/15/2016 20 >> 6/17/2016 10 >> 8/16/2016 30 >> 8/19/2016 40", >> header=TRUE,stringsAsFactors=FALSE) >> >> df1$Date<-as.Date(df1$Date,"%m/%d/%Y") >> >> df2<-read.table(text="Date Removal.Rate >> 6/17/2016 64.7 >> 6/30/2016 22.63 >> 7/14/2016 18.18 >> 8/19/2016 27.87", >> header=TRUE,stringsAsFactors=FALSE) >> >> df2$Date<-as.Date(df2$Date,"%m/%d/%Y") >> >> df3<-data.frame(Rate.Removal.Date=NULL,Date=NULL,Rainfall_Duration=NULL) >> >> df3row<-0 >> >> for(i in 1:dim(df2)[1]) { >> rdrows<-which(df2$Date[i] >= df1$Date & !(df2$Date[i] > df1$Date + 8)) >> if(!length(rdrows)) rdrows<-lastrows >> lastrows<-rdrows >> nrows<-length(rdrows) >> for(row in 1:nrows) { >> df3[row+df3row,1]<-format(df2$Date[i],"%m/%d/%Y") >> df3[row+df3row,2]<-format(df1$Date[rdrows[row]],"%m/%d/%Y") >> df3[row+df3row,3]<-df1$Rainfall_Duration[rdrows[row]] >> } >> df3row<-df3row+nrows >> } >> >> names(df3)<-c("Rate.Removal.Date","Date","Rainfall_Duration") >> df3 >> >> Jim >> >> On Wed, Feb 1, 2017 at 3:48 AM, Md Sami Bin Shokrana <samim...@live.com> >> wrote: >> > Hello guys, I am trying to solve a problem in R. I have 2 data frames >> which look like this: >> > df1 <- >> > Date Rainfall_Duration >> > 6/14/2016 10 >> > 6/15/2016 20 >> > 6/17/2016 10 >> > 8/16/2016 30 >> > 8/19/2016 40 >> > >> > df2 <- >> > Date Removal.Rate >> > 6/17/2016 64.7 >> > 6/30/2016 22.63 >> > 7/14/2016 18.18 >> > 8/19/2016 27.87 >> > >> > I want to look up the dates from df2 in df1 and their corresponding >> Rainfall_Duration data. For example, I want to look for the 1st date of df2 >> in df1 and subset rows in df1 for that specific date and 7 days prior to >> that. additionally, for example: for 6/30/2016 (in df2) there is no dates >> available in df1 within it's 7 days range. So, in this case I just want to >> extract the results same as it's previous date (6/17/2016) in df2. Same >> logic goes for 7/14/2016(df2). >> > The output should look like this: >> > >> > df3<- >> > >> > Rate.Removal.Date Date Rainfall_Duration >> > 6/17/2016 6/14/2016 10 >> > 6/17/2016 6/15/2016 20 >> > 6/17/2016 6/17/2016 10 >> > 6/30/2016 6/14/2016 10 >> > 6/30/2016 6/15/2016 20 >> > 6/30/2016 6/17/2016 10 >> > 7/14/2016 6/14/2016 10 >> > 7/14/2016 6/15/2016 20 >> > 7/14/2016 6/17/2016 10 >> > 8/19/2016 8/16/2016 30 >> > 8/19/2016 8/19/2016 40 >> > >> > I could subset data for the 7 days range. But could not do it when no >> dates are available in that range. I have the following code: >> > library(plyr) >> > library (dplyr) >> > df1$Date <- as.Date(df1$Date,format = "%m/%d/%Y") >> > df2$Date <- as.Date(df2$Date,format = "%m/%d/%Y") >> > >> > df3 <- lapply(df2$Date, function(x){ >> > filter(df1, between(Date, x-7, x)) >> > }) >> > >> > names(df3) <- as.character(df2$Date) >> > bind_rows(df3, .id = "Rate.Removal.Date") >> > df3 <- ldply (df3, data.frame, .id = "Rate.Removal.Date") >> > >> > I hope I could explain my problem properly. I would highly appreciate >> if someone can help me out with this code or a new one. Thanks in advance. >> > >> > >> > >> > >> > [[alternative HTML version deleted]] >> > >> > ______________________________________________ >> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> > https://stat.ethz.ch/mailman/listinfo/r-help >> R-help Info Page - Homepage - SfS – Seminar for Statistics >> <https://stat.ethz.ch/mailman/listinfo/r-help> >> stat.ethz.ch >> The main R mailing list, for announcements about the development of R and >> the availability of new code, questions and answers about problems and >> solutions using R ... >> >> >> > PLEASE do read the posting guide http://www.R-project.org/posti >> ng-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.