From the documentation I have found, it seems that one of the functions from package plyr, or a combination of functions like split and lapply, would allow me to have a really short R script to analyze all my data (I have reduced it to a couple hundred thousand records with about half a dozen records).
I get the same result from ddply and split/lapply:

> ddply(moreinfo,c("m_id","sale_year","sale_week"),
+ function(df) data.frame(res = fitdist(df$elapsed_time,"exp"),est = res$estimate,sd = res$sd))
Error in fitdist(df$elapsed_time, "exp") :
  data must be a numeric vector of length greater than 1

and

> lapply(split(moreinfo,list(moreinfo$m_id,moreinfo$sale_year,moreinfo$sale_week)),
+ function(df) fitdist(df$elapsed_time,"exp"))
Error in fitdist(df$elapsed_time, "exp") :
  data must be a numeric vector of length greater than 1

Now, in retrospect, unless I misunderstood the properties of a data.frame, I suppose a data.frame might not have been entirely appropriate, as the m_id samples start and end on very different dates, but I would have thought a list data structure should have been able to handle that.

It would seem that split is making groups that have the same start and end dates (or that if, for example, I have sale data for precisely the last year, split would insist on both 2009 and 2010 having weeks from 0 through 52 instead of just the weeks in each year that actually have data: 26 through 52 for last year and 1 through 25 for this year). I don't see how else the data passed to fitdist could have a sample size of 0.

I'd appreciate understanding how to resolve this. However, it isn't a show stopper, as it now seems trivial to just break it out into a loop (followed by a lapply/split combo using only sale year and sale month).

While I am asking, is there a better way to split such temporally ordered data into weekly samples that respect the year in which the sample is taken as well as the week in which it is taken?

Thanks

Ted
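
The guess above about split can be checked on a small example. This is only a sketch with made-up data: the data frame `toy` and its values are hypothetical and stand in for `moreinfo`. It relies on the `drop` argument of base R's split, which defaults to FALSE, so a list of grouping factors produces one group per combination of levels whether or not any rows fall into it.

```r
# Hypothetical toy data: 2009 only has weeks 26-27, 2010 only has weeks 1-2.
toy <- data.frame(
  sale_year    = c(2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010),
  sale_week    = c(26, 26, 27, 27, 1, 1, 2, 2),
  elapsed_time = c(1.2, 0.7, 3.1, 0.4, 2.2, 0.9, 1.5, 0.3)
)

# Default (drop = FALSE): one group per combination of levels, even empty ones.
all_groups <- split(toy, list(toy$sale_year, toy$sale_week))
sizes_all  <- sapply(all_groups, nrow)

# drop = TRUE: only combinations that actually occur in the data are kept.
used_groups <- split(toy, list(toy$sale_year, toy$sale_week), drop = TRUE)
sizes_used  <- sapply(used_groups, nrow)

# Groups with fewer than two rows are filtered out as well, since the error
# message says the data must have length greater than 1.
fit_ready <- Filter(function(df) nrow(df) > 1, used_groups)

print(sizes_all)
print(sizes_used)
```

If the real data behaves the same way, the zero-row groups would account for the fitdist error in the split/lapply version. Whether that is also the cause in the ddply call is not established here; groups containing a single observation would trigger the same message.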