[R] Is the aggregate function the best way to do this?

Bert Jacobs Wed, 17 Feb 2010 00:34:01 -0800

Hi,


I have a data frame 'Subset1' with a number of factor variables and 160
numerical variables.

Now I want to sum the numerical variables over all rows that share the same
values of the factor variables VAR1, VAR2 and VAR3, regardless of the values
of the other factor variables.

With the formula given below this works great, but in a situation with 15000
rows and 13 factor variables the calculation takes more than 2 minutes. 

So my question is: does anyone know of a faster alternative?

 

Subset1.AGG <- aggregate(
    Subset1[, (ncol(Subset1) - 159):ncol(Subset1)],  # the 160 numerical columns
    by = list(VAR1 = Subset1$VAR1, VAR2 = Subset1$VAR2, VAR3 = Subset1$VAR3),
    FUN = sum)  # aggregate() already returns a data frame
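
A small self-contained sketch of the same grouping on made-up data may make
the problem easier to reproduce. Everything in it is an assumption for
illustration: the data frame 'toy', its 200 rows, its 5 numeric columns and
the factor levels are invented and much smaller than the real 'Subset1'. It
also tries rowsum() from base R on a combined grouping key as one candidate
that might be worth timing against aggregate(); whether it is actually faster
on 15000 rows and 160 columns has not been measured here.

```r
# Toy stand-in for Subset1 (made-up data, not the real data frame)
set.seed(1)
n <- 200
toy <- data.frame(
  VAR1 = factor(sample(c("a", "b"), n, replace = TRUE)),
  VAR2 = factor(sample(c("x", "y", "z"), n, replace = TRUE)),
  VAR3 = factor(sample(c("p", "q"), n, replace = TRUE)),
  matrix(rpois(n * 5, lambda = 3), nrow = n)  # numeric columns X1..X5
)
num_cols <- 4:8

# Reference result: the aggregate() call from the post, on the toy data
agg <- aggregate(toy[, num_cols],
                 list(VAR1 = toy$VAR1, VAR2 = toy$VAR2, VAR3 = toy$VAR3),
                 FUN = sum)

# Candidate: collapse the three factors into one key and use rowsum()
key <- interaction(toy$VAR1, toy$VAR2, toy$VAR3, drop = TRUE)
rs  <- rowsum(as.matrix(toy[, num_cols]), group = key)

# Line the rowsum() rows up with the aggregate() rows and compare
agg_key <- paste(agg$VAR1, agg$VAR2, agg$VAR3, sep = ".")
same <- all.equal(unname(rs[agg_key, ]), unname(as.matrix(agg[, num_cols])))
print(same)
```

Unlike aggregate(), rowsum() returns a matrix whose row names are the combined
keys, so the grouping columns would have to be rebuilt from those names if a
data frame with VAR1, VAR2 and VAR3 is needed. system.time() around each call
gives a rough timing comparison.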

 

Thank you very much for helping me out,

Bert



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
