"Kevin B. Hendricks" <[EMAIL PROTECTED]> writes:

> My first R attempt was a simple
>
> # sort the data.frame gd and the sort key
> sorder <- order(MDPC)
> gd <- gd[sorder,]
> MDPC <- MDPC[sorder]
> attach(gd)
>
> # find the length and sum for each unique sort key
> XN <- by(MVE, MDPC, length)
> XSUM <- by(MVE, MDPC, sum)
> GRPS <- levels(as.factor(MDPC))
>
> Well the ordering and sorting was reasonably fast but the first "by"
> statement was still running 4 hours later on my machine (a dual 2.6
> gig Opteron with 4 gig of main memory).  This same snippet of code in
> SAS running on a slower machine takes about 5 minutes of system
> time.

I wonder if split() would be of use here.  Once you have sorted the
data frame gd and the sort keys MDPC, you could do:

    gdList <- split(gd$MVE, MDPC)
    xn <- sapply(gdList, length)
    xsum <- sapply(gdList, sum)

+ seth

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
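
For anyone who wants to try the split() suggestion without the original data, here is a minimal self-contained sketch. The data frame below is made up (random values and five arbitrary group keys), so it only stands in for the poster's gd, MVE and MDPC and says nothing about timings on the real data. The tapply() calls at the end are an alternative from base R that I am adding for comparison; they are not part of the suggestion above.

```r
# Made-up stand-in for the poster's data: MVE values and MDPC group keys.
set.seed(1)
n <- 1000
gd <- data.frame(MVE = runif(n),
                 MDPC = sample(letters[1:5], n, replace = TRUE))

# Split the values by key once, then summarise each piece.
gdList <- split(gd$MVE, gd$MDPC)
xn <- sapply(gdList, length)   # number of rows per key
xsum <- sapply(gdList, sum)    # sum of MVE per key
grps <- names(gdList)          # the distinct keys, in level order

# Alternative (assumption, not from the thread): tapply() computes the
# same per-key summaries without building the intermediate list.
xn2 <- tapply(gd$MVE, gd$MDPC, length)
xsum2 <- tapply(gd$MVE, gd$MDPC, sum)
```

Both routes return results named by key, so xn["a"] and xsum["a"] give the count and sum for key "a".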