huali, if I were you, I would create a view on the MySQL server to aggregate the data first and then use R to pull the data through that view. This applies not only to R; it is a general guideline for similar situations. In my understanding and experience, R handles data manipulation reasonably well. However, we should always use the right tool for the job.
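
To make the suggestion concrete, here is a minimal sketch of the view-then-pull approach. It rests on several assumptions that are not from the original thread: it uses the DBI package with an in-memory SQLite database as a stand-in for the MySQL server (so it runs without credentials), the view name salary_by_group is made up, and the toy values in X are invented. Against a real MySQL server only the dbConnect() line should need to change, for example to the RMySQL driver with your own connection details.

```r
# Assumption: in-memory SQLite stands in for the MySQL server.
# Requires the DBI and RSQLite packages.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Toy version of the data frame X from the question (values are invented).
X <- data.frame(
  Name   = c("a", "b", "c", "d"),
  Age    = c(30, 30, 40, 40),
  Group  = c("g1", "g1", "g1", "g2"),
  Type   = c("t1", "t1", "t2", "t2"),
  Salary = c(100, 200, 300, 400),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "X", X)

# Define the aggregation once, on the database side.
# GROUP is a reserved word in SQL, so the column name is quoted with
# backticks (accepted by both MySQL and SQLite).
# salary_by_group is a hypothetical view name.
dbExecute(con, "
  CREATE VIEW salary_by_group AS
  SELECT `Group`, Age, Type, AVG(Salary) AS mean_salary
  FROM X
  GROUP BY `Group`, Age, Type
")

# R only receives the rows that are already aggregated.
res <- dbGetQuery(
  con,
  "SELECT * FROM salary_by_group ORDER BY `Group`, Age, Type"
)
print(res)

dbDisconnect(con)
```

The point of the design is that the grouping runs inside the database engine and only one row per Group/Age/Type combination crosses into R, so R never has to hold or split the full table.
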
On Jan 26, 2008 6:45 PM, zhihuali <[EMAIL PROTECTED]> wrote:
>
> Hi, netters,
>
> First of all, thanks a lot for all the prompt replies to my earlier question
> about "merging" data frames in R.
> Actually that's equivalent to the "join" clause in MySQL.
>
> Now I have another question. Suppose I have a data frame X with lots of
> columns/variables:
> Name, Age, Group, Type, Salary.
> I want to compute subtotals of the salaries:
> aggregate(X$Salary, by=list(X$Group,X$Age,X$Type), FUN=mean)
>
> When Group and Type have a huge number of levels, it took R forever to
> finish the aggregation.
> And I used gc to find that the memory usage was big too.
>
> However, in MySQL, it took seconds to finish a similar job:
> select Group,Age,Type ,avg(Salary) from X group by Group,Age,Type
>
> Is it because MySQL is superior at this kind of thing? Or is my R command
> not efficient enough? Why did R have to consume so much memory to do the
> aggregation?
>
> Thanks again!
>
> Zhihua Li
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
===============================
WenSui Liu
Statistical Project Manager
ChoicePoint Precision Marketing
(http://spaces.msn.com/statcompute/blog)
===============================