"Juan Pablo Romero Méndez" <[EMAIL PROTECTED]> writes:

> Thanks!
>
> It turned out that Rmpi was a good option for this problem after all.
>
> Nevertheless, pnmath seems very promising, although it doesn't load on my
> system:
>
>>> library(pnmath)
> Error in dyn.load(file, DLLpath = DLLpath, ...) :
>   unable to load shared library
>   '/home/jpablo/extra/R-271/lib/R/library/pnmath/libs/pnmath.so':
>   libgomp.so.1: shared object cannot be dlopen()ed
> Error: package/namespace load failed for 'pnmath'
Yes, the pnmath README says:

  On Redhat EL 5 I have run into a problem where attempting to dlopen
  libgomp.so fails. A workaround is to link R.bin with -lgomp. This is
  not an issue on Fedora 7, so probably will go away at some point.

This is the problem you encountered. I think (out of my depth here) that
the issue is here to stay, rather than something unique to RHEL 5. The
somewhat cryptic solution is 'to link R.bin with -lgomp'. I hesitate to
give public advice on the black art of configuring R, but I translate
that to mean building R with

  % cd somedir
  % LIBS=-lgomp ~/path/to/R-source/configure
  % make -j4

I don't know what the deeper issues with doing things this way are.

Martin

> I find it odd, because libgomp.so.1 is in /usr/lib, so R should find it.
>
> Juan Pablo
>
> On Sun, Jun 29, 2008 at 1:36 AM, Martin Morgan <[EMAIL PROTECTED]> wrote:
>> "Juan Pablo Romero Méndez" <[EMAIL PROTECTED]> writes:
>>
>>> Hello,
>>>
>>> The problem I'm working on now requires operating on big matrices.
>>>
>>> I've noticed that there are some packages that allow running some
>>> commands in parallel. I've tried snow and NetWorkSpaces, without much
>>> success (they are far slower than the normal functions).
>>
>> Do you mean like this?
>>
>>> library(Rmpi)
>>> mpi.spawn.Rslaves(nsl=2) # dual core on my laptop
>>> m <- matrix(0, 10000, 1000)
>>> system.time(x1 <- apply(m, 2, sum), gcFirst=TRUE)
>>    user  system elapsed
>>   0.644   0.148   1.017
>>> system.time(x2 <- mpi.parApply(m, 2, sum), gcFirst=TRUE)
>>    user  system elapsed
>>   5.188   2.844  10.693
>>
>> ? (This is with Rmpi, a third alternative you did not mention;
>> 'elapsed' time seems to be relevant here.)
>>
>> The basic problem is that the overhead of dividing the matrix up and
>> communicating between processes outweighs the already-efficient
>> computation being performed.
>>
>> One solution is to organize your code into 'coarse' grains, so the FUN
>> in apply does (considerably) more work.
>>
>> A second approach is to develop a better algorithm / use an
>> appropriate R paradigm, e.g.,
>>
>>> system.time(x3 <- colSums(m), gcFirst=TRUE)
>>    user  system elapsed
>>   0.060   0.000   0.088
>>
>> (or even faster, x4 <- rep(0, ncol(m)) ;)
>>
>> A third approach, if your calculations make heavy use of linear
>> algebra, is to build R with a vectorized BLAS library; see the R
>> Installation and Administration guide.
>>
>> A fourth possibility is to use Tierney's 'pnmath' library mentioned in
>> this thread
>>
>> https://stat.ethz.ch/pipermail/r-help/2007-December/148756.html
>>
>> The README file needs to be consulted for the not-exactly-trivial (on
>> my system) task of installing the package. Specific functions are
>> parallelized, provided the length of the calculation makes it seem
>> worthwhile.
>>
>>> system.time(exp(m), gcFirst=TRUE)
>>    user  system elapsed
>>   0.108   0.000   0.106
>>> library(pnmath)
>>> system.time(exp(m), gcFirst=TRUE)
>>    user  system elapsed
>>   0.096   0.004   0.052
>>
>> (elapsed time about 2x faster). Both BLAS and pnmath make much better
>> use of resources, since they do not require multiple R instances.
>>
>> None of these approaches would make colSums faster -- the work is
>> just too small for the overhead.
>>
>> Martin
>>
>>> My problem is very simple; it doesn't require any communication
>>> between parallel tasks, only that it divide the task symmetrically
>>> among the available cores. Also, I don't want to run the code on a
>>> cluster, just my multicore machine (4 cores).
>>>
>>> What solution would you propose, given your experience?
>>>
>>> Regards,
>>>
>>> Juan Pablo
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> --
>> Martin Morgan
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M2 B169
>> Phone: (206) 667-2793
>>

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
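[Editor's note: the 'coarse grains' advice in the quoted message can be sketched as below. This is a minimal illustration using base R only; plain lapply() stands in for Rmpi's mpi.parLapply() so the chunking logic runs without a cluster, and the matrix is smaller than the thread's example. The variable names (nworkers, chunks) are illustrative, not from the thread.]

```r
## Coarse-grained division of labour: one task per *block* of columns
## rather than one task per column, so the work in each task dwarfs
## the overhead of shipping data to a worker.
m <- matrix(runif(1000 * 200), nrow = 1000)   # smaller than the thread's 10000 x 1000

nworkers <- 2                                 # e.g. one chunk per core
idx <- seq_len(ncol(m))
chunks <- split(idx, cut(idx, nworkers, labels = FALSE))

## Each task sums an entire block of columns; with Rmpi this FUN would
## be handed to mpi.parLapply() instead of lapply().
res <- lapply(chunks, function(cols) colSums(m[, cols, drop = FALSE]))
x <- unlist(res, use.names = FALSE)

stopifnot(all.equal(x, colSums(m)))           # same answer as the fine-grained apply
```

With only two tasks instead of 200, the per-task communication cost is paid twice rather than 200 times, which is the point of the 'coarse grains' suggestion.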