Thanks! It turned out that Rmpi was a good option for this problem after all.
Nevertheless, pnmath seems very promising, although it doesn't load on my system:

  > library(pnmath)
  Error in dyn.load(file, DLLpath = DLLpath, ...) :
    unable to load shared library
    '/home/jpablo/extra/R-271/lib/R/library/pnmath/libs/pnmath.so':
    libgomp.so.1: shared object cannot be dlopen()ed
  Error: package/namespace load failed for 'pnmath'

I find it odd, because libgomp.so.1 is in /usr/lib, so R should find it. (A guess at a workaround, plus rough sketches of the coarse-grained and BLAS approaches Martin suggests, are at the end of this message, below the quote.)

  Juan Pablo

On Sun, Jun 29, 2008 at 1:36 AM, Martin Morgan <[EMAIL PROTECTED]> wrote:
> "Juan Pablo Romero Méndez" <[EMAIL PROTECTED]> writes:
>
>> Hello,
>>
>> The problem I'm working on now requires operating on big matrices.
>>
>> I've noticed that there are some packages that allow running some
>> commands in parallel. I've tried snow and NetWorkSpaces, without much
>> success (they are far slower than the normal functions).
>
> Do you mean like this?
>
>> library(Rmpi)
>> mpi.spawn.Rslaves(nsl=2) # dual core on my laptop
>> m <- matrix(0, 10000, 1000)
>> system.time(x1 <- apply(m, 2, sum), gcFirst=TRUE)
>    user  system elapsed
>   0.644   0.148   1.017
>> system.time(x2 <- mpi.parApply(m, 2, sum), gcFirst=TRUE)
>    user  system elapsed
>   5.188   2.844  10.693
>
> ? (This is with Rmpi, a third alternative you did not mention;
> 'elapsed' time seems to be the relevant measure here.)
>
> The basic problem is that the overhead of dividing the matrix up and
> communicating between processes outweighs the already-efficient
> computation being performed.
>
> One solution is to organize your code into 'coarse' grains, so the FUN
> in apply does (considerably) more work.
>
> A second approach is to develop a better algorithm / use an
> appropriate R paradigm, e.g.,
>
>> system.time(x3 <- colSums(m), gcFirst=TRUE)
>    user  system elapsed
>   0.060   0.000   0.088
>
> (or even faster, x4 <- rep(0, ncol(m)) ;)
>
> A third approach, if your calculations make heavy use of linear
> algebra, is to build R with a vectorized BLAS library; see the R
> Installation and Administration guide.
>
> A fourth possibility is to use Tierney's 'pnmath' library mentioned in
> this thread:
>
> https://stat.ethz.ch/pipermail/r-help/2007-December/148756.html
>
> The README file needs to be consulted for the not-exactly-trivial (on
> my system) task of installing the package. Specific functions are
> parallelized, provided the length of the calculation makes it seem
> worthwhile.
>
>> system.time(exp(m), gcFirst=TRUE)
>    user  system elapsed
>   0.108   0.000   0.106
>> library(pnmath)
>> system.time(exp(m), gcFirst=TRUE)
>    user  system elapsed
>   0.096   0.004   0.052
>
> (elapsed time about 2x faster). Both BLAS and pnmath make much better
> use of resources, since they do not require multiple R instances.
>
> None of these approaches would make colSums faster -- the work is
> just too small for the overhead.
>
> Martin
>
>> My problem is very simple: it doesn't require any communication
>> between parallel tasks, only that the task be divided symmetrically
>> among the available cores. Also, I don't want to run the code on a
>> cluster, just on my multicore machine (4 cores).
>>
>> What solution would you propose, given your experience?
>>
>> Regards,
>>
>> Juan Pablo
>
> --
> Martin Morgan
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M2 B169
> Phone: (206) 667-2793
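PS. One thing I plan to try for the libgomp problem (only a guess, untested): loading libgomp with global symbol visibility before pnmath, in case pnmath.so simply fails to resolve its OpenMP symbols:

  ## guess at a workaround: dlopen libgomp globally, then load pnmath
  dyn.load("/usr/lib/libgomp.so.1", local = FALSE)  # local = FALSE -> RTLD_GLOBAL
  library(pnmath)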
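PS2. To make the 'coarse grains' suggestion concrete, here is a rough sketch (untested; the chunking and names like blocks are mine): give each slave one large block of columns, so a single remote call does substantial work instead of one tiny call per column:

  library(Rmpi)
  mpi.spawn.Rslaves(nsl = 2)
  m <- matrix(0, 10000, 1000)
  nslaves <- mpi.comm.size() - 1                     # workers, excluding the master
  ## split the column indices into one chunk per slave
  chunks <- split(seq_len(ncol(m)), cut(seq_len(ncol(m)), nslaves))
  blocks <- lapply(chunks, function(j) m[, j, drop = FALSE])
  ## one remote call per block, not one per column
  x <- unlist(mpi.parLapply(blocks, colSums), use.names = FALSE)

Shipping the blocks still costs communication, so for something as cheap as a column sum this will likely still lose to plain colSums(m); the pattern should only pay off when FUN itself is expensive.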
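PS3. As for the BLAS route: once R is linked against a threaded BLAS (ATLAS, GotoBLAS, etc., per the Installation and Administration guide), ordinary linear algebra calls spread across the cores with no change to user code. A quick, machine-dependent way to check whether a given build benefits:

  a <- matrix(rnorm(1000 * 1000), 1000)
  system.time(b <- crossprod(a), gcFirst = TRUE)  # t(a) %*% a, computed inside the BLAS

With the reference single-threaded BLAS all of the time lands on one core; a tuned threaded BLAS should divide it across the four.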