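You're right:

library(zoo)  # for na.locf()
wk <- as.numeric(format(myframe$dates, "%Y.%W"))
# Week "00" is the tail of the previous year's last week:
# mark it NA so na.locf() carries that week's label forward.
is.na(wk) <- wk %% 1 == 0
solution <- aggregate(value ~ group + na.locf(wk), myframe, FUN = sum)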
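As an aside, the output requested in the original post is keyed by the Monday that starts each week, and that can be computed directly, without week numbers or na.locf(). The following is only a minimal sketch, not the method used elsewhere in this thread. It assumes that format() supports "%u" (weekday as a number, Monday = 1) on your platform; the tiny data frame and the column name week_start are made up for illustration. On the other question: "%W" does count weeks as starting on Monday ("%U" is the Sunday-based variant).

```r
# Hypothetical toy data (not the data from the thread): two groups,
# irregular dates that straddle the 2008/2009 year boundary.
myframe <- data.frame(
  dates = as.Date(c("2008-12-29", "2008-12-31", "2009-01-02", "2009-01-04",
                    "2009-01-05", "2009-01-07", "2008-12-30", "2009-01-06")),
  group = c(rep("group.1", 6), rep("group.2", 2)),
  value = c(1, 2, 3, 4, 5, 6, 10, 20)
)

# "%u" gives the weekday number with Monday = 1, ..., Sunday = 7.
# Subtracting (weekday - 1) days moves every date back to its Monday,
# so dates on either side of New Year share the same week_start.
myframe$week_start <- myframe$dates -
  (as.integer(format(myframe$dates, "%u")) - 1L)

# Sum the values per group and per Monday-based week.
solution <- aggregate(value ~ group + week_start, data = myframe, FUN = sum)
solution <- solution[order(solution$group, solution$week_start), ]
print(solution)
```

Because each week is identified by an actual date rather than a week number, weeks from different years cannot be lumped together, and missing Mondays in the data do not matter.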

On Wed, Mar 30, 2011 at 6:10 PM, Dimitri Liakhovitski
<dimitri.liakhovit...@gmail.com> wrote:
> Yes, zoo! That's what I forgot. It's great.
> Henrique, thanks a lot! One question:
>
> if the data are as I originally posted - then week numbered 52 is
> actually the very first week (it straddles 2008-2009).
> What if the data are much longer (like in the code below - same as before,
> but more dates) so that we have more than 1 year to deal with.
> It looks like this code is lumping everything into 52 weeks. And my
> goal is to keep each week independent. If I have 2 years, then it
> should be 100+ weeks. Makes sense?
> Thank you!
>
> ### Creating a longer example data set:
> mydates<-rep(seq(as.Date("2008-12-29"), length = 500, by = "day"),2)
> myfactor<-c(rep("group.1",500),rep("group.2",500))
> set.seed(123)
> myvalues<-runif(1000,0,1)
> myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues)
> (myframe)
> dim(myframe)
>
> ## Removing same rows (dates) unsystematically:
> set.seed(123)
> removed.group1<-sample(1:500,size=150,replace=F)
> set.seed(456)
> removed.group2<-sample(501:1000,size=150,replace=F)
> to.remove<-c(removed.group1,removed.group2);length(to.remove)
> to.remove<-to.remove[order(to.remove)]
> myframe<-myframe[-to.remove,]
> (myframe)
> dim(myframe)
> names(myframe)
>
> library(zoo)
> wk <- as.numeric(format(myframe$dates, '%W'))
> is.na(wk) <- wk == 0
> solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum)
> solution<-solution[order(solution$group),]
> write.csv(solution,file="test.csv",row.names=F)
>
> On Wed, Mar 30, 2011 at 4:45 PM, Henrique Dallazuanna <www...@gmail.com>
> wrote:
>> Try this:
>>
>> library(zoo)
>> wk <- as.numeric(format(myframe$dates, '%W'))
>> is.na(wk) <- wk == 0
>> aggregate(value ~ group + na.locf(wk), myframe, FUN = sum)
>>
>> On Wed, Mar 30, 2011 at 4:35 PM, Dimitri Liakhovitski
>> <dimitri.liakhovit...@gmail.com> wrote:
>>> Henrique, this is great, thank you!
>>>
>>> It's almost what I was looking for! Only one small thing - it doesn't
>>> "merge" the results for weeks that "straddle" 2 years. In my example -
>>> last week of year 2008 and the very first week of 2009 are one week.
>>> Any way to "join them"?
>>> Asking because in reality I'll have many years and hundreds of groups
>>> - hence, it'll be hard to do it manually.
>>>
>>> BTW - does format(dates,"%Y.%W") always consider weeks as starting with
>>> Mondays?
>>>
>>> Thank you very much!
>>> Dimitri
>>>
>>> On Wed, Mar 30, 2011 at 2:55 PM, Henrique Dallazuanna <www...@gmail.com>
>>> wrote:
>>>> Try this:
>>>>
>>>> aggregate(value ~ group + format(dates, "%Y.%W"), myframe, FUN = sum)
>>>>
>>>> On Wed, Mar 30, 2011 at 11:23 AM, Dimitri Liakhovitski
>>>> <dimitri.liakhovit...@gmail.com> wrote:
>>>>> Dear everybody,
>>>>>
>>>>> I have the following challenge. I have a data set with 2 subgroups,
>>>>> dates (days), and corresponding values (see example code below).
>>>>> Within each subgroup: I need to aggregate (sum) the values by week -
>>>>> for weeks that start on a Monday (for example, 2008-12-29 was a
>>>>> Monday).
>>>>> I find it difficult because I have missing dates in my data - so that
>>>>> sometimes I don't even have the date for some Mondays. So, I can't
>>>>> write a proper loop.
>>>>> I want my output to look something like this:
>>>>> group    dates       value
>>>>> group.1  2008-12-29  3.0937
>>>>> group.1  2009-01-05  3.8833
>>>>> group.1  2009-01-12  1.362
>>>>> ...
>>>>> group.2  2008-12-29  2.250
>>>>> group.2  2009-01-05  1.4057
>>>>> group.2  2009-01-12  3.4411
>>>>> ...
>>>>>
>>>>> Thanks a lot for your suggestions! The code is below:
>>>>> Dimitri
>>>>>
>>>>> ### Creating example data set:
>>>>> mydates<-rep(seq(as.Date("2008-12-29"), length = 43, by = "day"),2)
>>>>> myfactor<-c(rep("group.1",43),rep("group.2",43))
>>>>> set.seed(123)
>>>>> myvalues<-runif(86,0,1)
>>>>> myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues)
>>>>> (myframe)
>>>>> dim(myframe)
>>>>>
>>>>> ## Removing same rows (dates) unsystematically:
>>>>> set.seed(123)
>>>>> removed.group1<-sample(1:43,size=11,replace=F)
>>>>> set.seed(456)
>>>>> removed.group2<-sample(44:86,size=11,replace=F)
>>>>> to.remove<-c(removed.group1,removed.group2);length(to.remove)
>>>>> to.remove<-to.remove[order(to.remove)]
>>>>> myframe<-myframe[-to.remove,]
>>>>> (myframe)
>>>>>
>>>>> --
>>>>> Dimitri Liakhovitski
>>>>> Ninah Consulting
>>>>> www.ninah.com
>>>>>
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>> --
>>>> Henrique Dallazuanna
>>>> Curitiba-Paraná-Brasil
>>>> 25° 25' 40" S 49° 16' 22" O

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O