Hi, netters,

First of all, thanks a lot for all the prompt replies to my earlier question
about "merging" data frames in R.
That operation is essentially the equivalent of a JOIN in MySQL.

Now I have another question. Suppose I have a data frame X with lots of
columns/variables:
Name, Age, Group, Type, Salary.
I want to compute the mean salary within each group:
aggregate(X$Salary, by = list(X$Group, X$Age, X$Type), FUN = mean)

When Group and Type have a huge number of levels, R takes forever to finish
the aggregation.
I also checked with gc() and found that the memory usage was large as well.
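To make the problem concrete, here is a minimal self-contained sketch with made-up data (the sizes and level counts are my own assumptions, not my real data). It runs the aggregate() call from above, and also shows a base-R alternative that collapses the three grouping variables into a single factor with interaction() and then uses tapply(), which in my understanding often handles many group levels with less overhead:

```r
## Hypothetical data frame mimicking X (sizes are assumptions for illustration).
set.seed(1)
n <- 10000
X <- data.frame(
  Name   = paste0("p", seq_len(n)),
  Age    = sample(20:60, n, replace = TRUE),
  Group  = factor(sample(1:50, n, replace = TRUE)),
  Type   = factor(sample(1:50, n, replace = TRUE)),
  Salary = runif(n, 2e4, 1e5)
)

## The call from the post (note FUN=, not Fun=):
agg <- aggregate(X$Salary, by = list(X$Group, X$Age, X$Type), FUN = mean)

## Alternative: one combined grouping factor (observed combinations only),
## then a single pass with tapply().
key  <- interaction(X$Group, X$Age, X$Type, drop = TRUE)
fast <- tapply(X$Salary, key, mean)

## Both should produce one mean per observed (Group, Age, Type) combination.
```

This is only a sketch for timing comparisons, not a claim about why aggregate() is slow on any particular R version.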

However, in MySQL it takes only seconds to finish a similar job
(backticks around `Group`, since GROUP is a reserved word):

select `Group`, Age, Type, avg(Salary) from X group by `Group`, Age, Type;

Is it because MySQL is simply better at this kind of operation, or is my R
command inefficient? Why does R have to consume so much memory to do the
aggregation?

Thanks again!

Zhihua Li


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
