The difference is that in the parallel package you use mclapply for
multicore parallelism and parLapply for multi-machine parallelism, so
switching from one to the other means changing every call from one
function to the other. If you use llply(..., .parallel=TRUE), then all
you have to do is register a different backend (one line of code to
load the new backend and a second to register it), and the rest of
your code stays the same.
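For example, something like this (an untested sketch; doMC and
doParallel are just one possible pair of backends, and the host names
are made up):

    library(plyr)

    ## Multicore on one machine (doMC backend):
    # library(doMC); registerDoMC(cores = 4)

    ## Multiple machines via a snow-style cluster (doParallel backend):
    # library(doParallel)
    # cl <- parallel::makeCluster(c("node1", "node2"))  # hypothetical hosts
    # registerDoParallel(cl)

    ## This call stays the same no matter which backend is registered
    res <- llply(1:10, function(i) i^2, .parallel = TRUE)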
On Fri 16 Nov 2012 03:24:56 PM PST, Michael Lawrence wrote:
On Fri, Nov 16, 2012 at 11:44 AM, Ryan C. Thompson
<r...@thompsonclan.org> wrote:
You don't have to use foreach directly. I use foreach almost
exclusively through the plyr package, which uses foreach
internally to implement parallelism. Like you, I'm not
particularly fond of the foreach syntax (though it has some nice
features that come in handy sometimes).
The appeal of foreach is that it supports pluggable parallelizing
backends, so you can (in theory) write the same code and
parallelize it across multiple cores, or across an entire cluster,
just by plugging in different backends.
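For instance, the loop below is identical whether doParallel is
pointed at local cores or at a list of remote hosts (untested sketch):

    library(foreach)
    library(doParallel)

    cl <- parallel::makeCluster(2)  # or a vector of remote host names
    registerDoParallel(cl)

    res <- foreach(i = 1:10, .combine = c) %dopar% sqrt(i)

    parallel::stopCluster(cl)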
But isn't this also possible with the parallel package? It was
inherited from snow. I'd be more in favor of extending the parallel
package, simply because it's part of base R.
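E.g., roughly (untested sketch; the host names are made up):

    library(parallel)

    ## snow-style cluster, works across machines
    cl <- makePSOCKcluster(c("node1", "node2"))
    res <- parLapply(cl, 1:10, sqrt)
    stopCluster(cl)

    ## multicore equivalent on a single machine, but a different function
    # res <- mclapply(1:10, sqrt, mc.cores = 2)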
On Fri 16 Nov 2012 10:17:24 AM PST, Michael Lawrence wrote:
I'm not sure I understand the appeal of foreach. Why not do this
within the functional paradigm, i.e., parLapply?
Michael
On Fri, Nov 16, 2012 at 9:41 AM, Ryan C. Thompson
<r...@thompsonclan.org> wrote:
You could write a %dopar% backend for the foreach package, which
would allow any code using foreach (or plyr, which uses foreach) to
parallelize using your code.

On a related note, it might be nice to add Bioconductor-compatible
versions of foreach and the plyr functions to BiocParallel if they're
not already compatible.
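A rough skeleton of what such a backend looks like (untested sketch;
the names are made up, and this toy version just evaluates every task
in the calling process, where a real backend would submit each task to
the cluster):

    library(foreach)
    library(iterators)

    doToy <- function(obj, expr, envir, data) {
      it <- iter(obj)        # iterate over the loop's argument sets
      acc <- makeAccum(it)   # accumulator that honours .combine etc.
      i <- 1L
      repeat {
        args <- try(nextElem(it), silent = TRUE)
        if (inherits(args, "try-error")) break
        ## evaluate the loop body with the iteration variables in scope;
        ## a real backend would ship expr and args to a worker instead
        res <- eval(expr, envir = list2env(args, parent = envir))
        acc(list(res), i)
        i <- i + 1L
      }
      getResult(it)
    }

    ## register it so %dopar% (and hence plyr's .parallel=TRUE) uses it
    setDoPar(doToy, data = NULL,
             info = function(data, item)
               switch(item, workers = 1L, name = "doToy",
                      version = "0.1", NULL))

    foreach(x = 1:3, .combine = c) %dopar% x^2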
On 11/16/2012 12:18 AM, Hahne, Florian wrote:
I've hacked up some code that uses BatchJobs but makes it look like a
normal parLapply operation. Currently the main R process checks the
state of the queue at regular intervals and fetches results once a job
has finished. Seems to work quite nicely, although there certainly are
more elaborate ways to deal with the synchronous/asynchronous issue.
Is that something that could be interesting for the broader audience?
I could add the code to BiocParallel for folks to try it out.

The whole thing may be a dumb idea, but I find it kind of useful to be
able to start parallel jobs directly from R on our huge SGE cluster,
have the calling script wait for all jobs to finish and then continue
with some downstream computations, rather than having to manually
check the job status and start another script once the results are
there.

Florian
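Roughly the shape of what Florian describes, as an untested sketch
using the public BatchJobs API (not his actual code, which isn't shown
here; which scheduler is used, e.g. SGE via makeClusterFunctionsSGE(),
comes from the BatchJobs configuration):

    library(BatchJobs)

    bjLapply <- function(X, FUN, ...) {
      reg <- makeRegistry(id = "bjLapply", file.dir = tempfile("bj-"))
      batchMap(reg, FUN, X, more.args = list(...))  # one job per element
      submitJobs(reg)       # hand the jobs to the configured scheduler
      waitForJobs(reg)      # poll the queue until every job has finished
      loadResults(reg, simplify = FALSE)  # fetch results in job order
    }

    ## e.g. res <- bjLapply(1:100, sqrt)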
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel