To be more specific, instead of:
library(parallel)
cl <- ... # Make a cluster
parLapply(cl, X, fun, ...)
you can do:
library(parallel)
library(doParallel)
library(plyr)
cl <- ... # Make a cluster, as before
registerDoParallel(cl)
llply(X, fun, ..., .parallel=TRUE)
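Spelled out as a complete runnable sketch (using a small local cluster purely for illustration):

```r
library(parallel)
library(doParallel)
library(plyr)

cl <- makeCluster(2)               # two local workers, for illustration
registerDoParallel(cl)             # make them the foreach backend
res <- llply(1:4, function(x) x^2, .parallel = TRUE)
stopCluster(cl)
res                                # list(1, 4, 9, 16)
```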
On Fri 16 Nov 2012 11:44:06 AM PST, Ryan C. Thompson wrote:
You don't have to use foreach directly. I use foreach almost
exclusively through the plyr package, which uses foreach internally to
implement parallelism. Like you, I'm not particularly fond of the
foreach syntax (though it has some nice features that come in handy
sometimes).
The appeal of foreach is that it supports pluggable parallelizing
backends, so you can (in theory) write the same code and parallelize
it across multiple cores, or across an entire cluster, just by
plugging in different backends.
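For example (the wrapper function name here is mine), the identical loop runs on whichever backend is currently registered:

```r
library(foreach)
library(doParallel)

# The same code, parameterized only by the registered backend
sq <- function() foreach(i = 1:3, .combine = c) %dopar% i^2

registerDoSEQ()                # sequential backend
sq()                           # c(1, 4, 9), run serially

cl <- makeCluster(2)
registerDoParallel(cl)         # multicore backend, same code
sq()                           # c(1, 4, 9), run on two workers
stopCluster(cl)
```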
On Fri 16 Nov 2012 10:17:24 AM PST, Michael Lawrence wrote:
I'm not sure I understand the appeal of foreach. Why not do this
within the functional paradigm, i.e., with parLapply?
Michael
On Fri, Nov 16, 2012 at 9:41 AM, Ryan C. Thompson
<r...@thompsonclan.org> wrote:
You could write a %dopar% backend for the foreach package, which
would allow any code using foreach (or plyr which uses foreach) to
parallelize using your code.
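A minimal backend might look roughly like the following sketch. It uses the obj/expr/envir/data protocol and the makeAccum/getResult helpers that existing backend packages such as doParallel build on; the function name is made up, error handling is omitted, and this one simply runs jobs sequentially to show the shape of the interface:

```r
library(foreach)
library(iterators)

# Sketch of a minimal %dopar% backend (sequential, for illustration only)
doToySEQ <- function(obj, expr, envir, data) {
  it <- iter(obj)                  # iterate over the foreach arguments
  argsList <- as.list(it)          # one named list per iteration
  accumulator <- makeAccum(it)     # applies .combine etc.
  results <- lapply(argsList, function(args) {
    # bind the iteration variables in the evaluation environment
    for (n in names(args)) assign(n, args[[n]], envir = envir)
    eval(expr, envir = envir)
  })
  accumulator(results, seq_along(results))
  getResult(it)
}

setDoPar(doToySEQ)                 # register it
foreach(i = 1:3, .combine = c) %dopar% i^2
```

A real backend for a batch queue would replace the `lapply` with job submission and result collection, which is exactly where the synchronous/asynchronous question comes in.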
On a related note, it might be nice to add Bioconductor-compatible
versions of foreach and the plyr functions to BiocParallel if
they're not already compatible.
On 11/16/2012 12:18 AM, Hahne, Florian wrote:
I've hacked up some code that uses BatchJobs but makes it look like a
normal parLapply operation. Currently the main R process checks the
state of the queue at regular intervals and fetches results once a job
has finished. It seems to work quite nicely, although there are
certainly more elaborate ways to deal with the synchronous/asynchronous
issue. Is that something that could be interesting for a broader
audience? I could add the code to BiocParallel for folks to try it out.
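The submit-poll-fetch pattern described here could be sketched with the BatchJobs API roughly as below; the wrapper name, registry id, and polling interval are illustrative, and real code would need a cluster configuration (e.g. an SGE template) plus error handling:

```r
library(BatchJobs)

# Hypothetical parLapply-style wrapper around BatchJobs (sketch only)
sgeLapply <- function(X, FUN, ..., sleep = 30) {
  reg <- makeRegistry(id = "sgeLapply")     # file-based job registry
  batchMap(reg, FUN, X, more.args = list(...))
  submitJobs(reg)                           # hand the jobs to the scheduler
  waitForJobs(reg, sleep = sleep)           # poll the queue until all finish
  loadResults(reg, simplify = FALSE)        # fetch results, one per job
}
```

The calling script then blocks in `waitForJobs()` and continues with downstream computations once the results are in, which matches the workflow described above.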
The whole thing may be a dumb idea, but I find it kind of useful to be
able to start parallel jobs directly from R on our huge SGE cluster,
have the calling script wait for all jobs to finish, and then continue
with some downstream computations, rather than having to manually check
the job status and start another script once the results are there.
Florian
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel