Dear R users, I have a data frame where column 1 has countries, column 2 has dates (monthly) for each country, and the next 10 columns are my factors, with a measurement for each country and each date. I have attached a sample of the data in csv format with the data for 3 countries.
I would like to convert my monthly data into quarterly data, taking the mean over each 3-month period for factors a-i and the sum for factor j. My problem is that not all countries start at the beginning of a quarter in a given year, i.e. some countries start in May or September. Also, some countries have one extra month at the end and some have two. I want to get rid of all data that does not fall into a complete quarter for each country, but since the amount of data to drop varies by country, there's no way of deleting the rows with a simple command.

I tried, for example:

i=1
denise<-data[data$country==unique(data$country)[i],]
denise[,2]<- as.Date(denise$date, "%Y-%m-%d")
denise2<-denise[order(denise[,2],decreasing=FALSE),]
len<-length(denise[,1])
limit<-floor(len/3)+1
splitter<-rep(1:limit,each=3)
spl.dat<-split(denise2,splitter)
new.data<-as.matrix(lapply(spl.dat,FUN="mean"))

This finds the mean of every 3 rows, but it doesn't treat the data as quarterly in a calendar sense. I.e. if the data starts in November, it doesn't discard the data for November and December and start calculating the means from January onwards, up to the month where the last complete quarter finishes, discarding any extra month or two months at the end.

I tried converting my data frame/matrix to a time series, but the dates are not kept. I got:

> tser<-as.ts(denise)
Warning message:
In data.matrix(data) : class information lost from one or more columns

and column 2 has become a list of numbers rather than dates. I then tried:

> library(fCalendar)
> den.tseries<-as.timeSeries(denise)
Warning messages:
1: In .whichFormat(charvec, ...) : Could not determine time(date) format
2: In .whichFormat(charvec, ...) : Could not determine time(date) format
> is.timeSeries(den.tseries)
[1] TRUE
> apply.quarterly(den.tseries,FUN="mean")
                   data
1970-01-01 -2.425000000
1970-04-01 -0.557961111
1970-04-28  0.009814815

Here it calculates things quarterly, but the as.timeSeries command has assigned its own daily dates to the data instead of keeping my monthly dates. Also, I don't understand how it deals with the extra dates.

Sorry for the long email. Any help would be very much appreciated.

Kind regards,
Denise
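
P.S. To make the goal concrete, below is a rough base R sketch of the calculation described above, run on made-up data rather than the attached file. The data frame `dat`, the helper `make_country` and the reduced column set (a and b standing in for a-i, plus j) are invented for illustration, and it assumes one row per country per month with no gaps in the middle. It is only one possible approach under those assumptions, not necessarily the idiomatic one.

```r
# Made-up monthly data in the layout described above (all values invented).
# Country A starts in November, so Nov/Dec 2000 form an incomplete quarter;
# country B ends in May, so Apr/May 2001 form an incomplete quarter.
set.seed(1)
make_country <- function(name, start, n) {   # hypothetical helper
  data.frame(country = name,
             date = seq(as.Date(start), by = "month", length.out = n),
             a = rnorm(n), b = rnorm(n), j = rpois(n, 10),
             stringsAsFactors = FALSE)
}
dat <- rbind(make_country("A", "2000-11-01", 8),   # Nov 2000 to Jun 2001
             make_country("B", "2001-01-01", 5))   # Jan 2001 to May 2001

# Label every row with its calendar quarter, e.g. "2001 Q1".
dat$quarter <- paste(format(dat$date, "%Y"), quarters(dat$date))

# Count the months present in each country/quarter and keep only the
# quarters that have all three months (assumes one row per month).
key <- interaction(dat$country, dat$quarter, drop = TRUE)
n_months <- ave(seq_along(key), key, FUN = length)
complete <- dat[n_months == 3, ]

# Mean for the "mean" factors, sum for factor j, then combine the results.
grp <- complete[c("country", "quarter")]
means <- aggregate(complete[c("a", "b")], by = grp, FUN = mean)
sums  <- aggregate(complete["j"], by = grp, FUN = sum)
quarterly <- merge(means, sums, by = c("country", "quarter"))
print(quarterly)
```

With this made-up input, the result should contain only the complete quarters (2001 Q1 and Q2 for country A, 2001 Q1 for country B); the leading November/December rows and the trailing April/May rows are dropped before aggregating.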
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.