Hi, You might look at Reduce(). It seems faster. I converted the matrix to a list in an incredibly sloppy way (which you should not emulate) because I cannot think of the simple way.
> probs <- t(matrix(rep(1:10000000), nrow=10)) # matrix with row-wise > probabilites > F <- matrix(0, nrow=nrow(probs), ncol=ncol(probs)); > F[,1] <- probs[,1,drop=TRUE]; > add <- function(x) {Reduce(`+`, x, accumulate = TRUE)} > > > system.time(F.slow <- t(apply(probs, 1, cumsum))) user system elapsed 36.758 0.416 42.464 > > system.time(for (cc in 2:ncol(F)) { + F[,cc] <- F[,cc-1,drop=TRUE] + probs[,cc,drop=TRUE]; + }) user system elapsed 0.980 0.196 1.328 > > system.time(add(list(probs[,1], probs[,2], probs[,3], probs[,4], probs[,5], > probs[,6], probs[,7], probs[,8], probs[,9], probs[,10]))) user system elapsed 0.420 0.072 0.539 > .539 seconds for 10 vectors each with 1 million elements does not seem bad. For 30000 each, on my system it took .014 seconds, which for 200,000 iterations, I estimated as: > (200000*.014)/60/60 [1] 0.7777778 (and this is on a stone age system with a single core processor and only 756MB of RAM) It should not be difficult to get the list back to a matrix. Cheers, Josh On Thu, Oct 14, 2010 at 11:51 PM, Gregor <mailingl...@gmx.at> wrote: > Dear all, > > Maybe the "easiest" solution: Is there anything that speaks against > generalizing > cumsum from base to cope with matrices (as is done in matlab)? E.g.: > > "cumsum(Matrix, 1)" > equivalent to > "apply(Matrix, 1, cumsum)" > > The main advantage could be optimized code if the Matrix is extreme nonsquare > (e.g. 100,000x10), but the summation is done over the short side (in this > case 10). > apply would practically yield a loop over 100,000 elements, and vectorization > w.r.t. > the long side (loop over 10 elements) provides considerable efficiency gains. > > Many regards, > Gregor > > > > > On Tue, 12 Oct 2010 10:24:53 +0200 > Gregor <mailingl...@gmx.at> wrote: > >> Dear all, >> >> I am struggling with a (currently) cost-intensive problem: calculating the >> (non-normalized) cumulative distribution function, given the (non-normalized) >> probabilities. something like: >> >> probs <- t(matrix(rep(1:100),nrow=10)) # matrix with row-wise probabilites >> F <- t(apply(probs, 1, cumsum)) #SLOOOW! >> >> One (already faster, but for sure not ideal) solution - thanks to Henrik >> Bengtsson: >> >> F <- matrix(0, nrow=nrow(probs), ncol=ncol(probs)); >> F[,1] <- probs[,1,drop=TRUE]; >> for (cc in 2:ncol(F)) { >> F[,cc] <- F[,cc-1,drop=TRUE] + probs[,cc,drop=TRUE]; >> } >> >> In my case, probs is a (30,000 x 10) matrix, and i need to iterate this step >> around >> 200,000 times, so speed is crucial. I currently can make sure to have no >> NAs, but >> in order to extend matrixStats, this could be a nontrivial issue. >> >> Any ideas for speeding up this - probably routine - task? >> >> Thanks in advance, >> Gregor >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.