On 11/17/2012 02:39 AM, Ramon Diaz-Uriarte wrote:
> In addition to Steve's comment, is it really a good thing that "all code
> stays the same"?  I mean, multiple machines vs. multiple cores are,
> often, _very_ different things: for instance, shared vs. distributed
> memory, communication overhead differences, whether or not you can assume
> packages and objects to be automagically present in the slaves/child
> process, etc. So, given they are different situations, I think it
> sometimes makes sense to want to write different code for each situation
> (I often do); not to mention Steve's hybrid cases ;-).
>
> Since BiocParallel seems to be a major undertaking, maybe it would be
> appropriate to provide a flexible approach, instead of hard wiring the
> foreach approach.

Of course there are cases where the same code simply can't work in both multicore and multi-machine settings, but those generally aren't the kind of thing you can do with lapply. lapply and its parallel counterparts (mclapply, parLapply, and foreach) are designed for data-parallel operations whose results don't depend on one another. Such operations generally parallelize as well across machines as across cores, unless your network is too slow (in which case you would simply not use multi-machine parallelism). If you want a parallel algorithm for something like the disjoin method of GRanges, you might need to write special-purpose code, and that code might be very different for multicore vs. multi-machine.
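
To make the "data-parallel" point concrete, here is a minimal sketch using only the base parallel package. The function square and the vector inputs are made up for illustration, and the two local workers stand in for what could be remote hosts. Each result depends only on its own input, so the serial, forked, and cluster versions should all agree:

```r
library(parallel)

# Hypothetical per-element task: each result depends only on its own input.
square <- function(x) x^2
inputs <- 1:8

# Serial baseline.
serial <- lapply(inputs, square)

# Multicore via forking (Unix-alikes only; mc.cores > 1 is an error on Windows).
forked <- mclapply(inputs, square, mc.cores = 2)

# Socket cluster with two local workers. makeCluster() also accepts a
# character vector of host names for a multi-machine setup.
cl <- makeCluster(2)
clustered <- parLapply(cl, inputs, square)
stopCluster(cl)

# All three approaches give the same list of results.
stopifnot(identical(serial, forked), identical(serial, clustered))
```

Note that the work itself is identical in all three, but the calling code is not: mclapply takes mc.cores, while parLapply needs a cluster object that you create and tear down yourself.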

So yes, sometimes there is a fundamental reason that you have to change the code to make it run on multiple machines, and neither foreach nor any other parallelization framework will save you from having to rewrite your code. Often, though, there is no fundamental reason for the code to change, and you end up changing it anyway because of limitations in your parallelization framework. That is the case foreach saves you from.
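
As a rough sketch of what I mean, assuming the foreach and doParallel packages are installed (doParallel is just one possible backend, and sum_of_roots is a made-up example function): the loop is written once, and only the registered backend changes between runs.

```r
library(foreach)
library(doParallel)  # one of several foreach backends; assumed installed

# Written once; nothing in here says how or where the loop runs.
sum_of_roots <- function(n) {
  foreach(i = seq_len(n), .combine = `+`) %dopar% sqrt(i)
}

# Sequential backend.
registerDoSEQ()
r_seq <- sum_of_roots(100)

# Two cores on the local machine.
registerDoParallel(cores = 2)
r_cores <- sum_of_roots(100)

# Explicit socket cluster. These are two local workers, but makeCluster()
# also accepts a vector of host names for remote machines.
cl <- makeCluster(2)
registerDoParallel(cl)
r_cluster <- sum_of_roots(100)
stopCluster(cl)
registerDoSEQ()  # don't leave a stopped cluster registered

# Same answer from the same loop under all three backends.
stopifnot(isTRUE(all.equal(r_seq, r_cores)),
          isTRUE(all.equal(r_seq, r_cluster)))
```

This only covers the easy case, of course. If the loop body needed packages or large objects on the workers, you would still have to think about how they get there.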

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
