Is there any write up/discussion/plans on the various types of parallel computations out there:
(1) one machine / multi-core/multi-threaded (2) multiple machines / multiple processes (3) batch / queue processing (on large compute clusters with many users). (4) ... Are we/you mainly focusing on (1) and (2)? /Henrik On Thu, Nov 15, 2012 at 6:21 AM, Kasper Daniel Hansen <kasperdanielhan...@gmail.com> wrote: > I'll second Ryan's patch (at least in principle). When I parallelize > across multiple cores, I have always found mc.preschedule to be an > important option to expose (that, and the number of cores, is all I > use routinely). > > Kasper > > On Wed, Nov 14, 2012 at 7:14 PM, Ryan C. Thompson <r...@thompsonclan.org> > wrote: >> I just submitted a pull request. I'll add tests shortly if I can figure out >> how to write them. >> >> >> On Wed 14 Nov 2012 03:50:36 PM PST, Martin Morgan wrote: >>> >>> On 11/14/2012 03:43 PM, Ryan C. Thompson wrote: >>>> >>>> Here are two alternative implementations of pvec. pvec2 is just a >>>> simple rewrite >>>> of pvec to use mclapply. pvec3 then extends pvec2 to accept a >>>> specified chunk >>>> size or a specified number of chunks. If the number of chunks exceeds >>>> the number >>>> of cores, then multiple chunks will get run sequentially on each >>>> core. pvec3 >>>> also exposes the "mc.prescheule" argument of mclapply, since that is >>>> relevant >>>> when there are more chunks than cores. Lastly, I provide a >>>> "pvectorize" function >>>> which can be called on a regular vectorized function to make it into >>>> a pvec'd >>>> version of itself. Usage is like: sqrt.parallel <- pvectorize(sqrt); >>>> sqrt.parallel(1:1000). >>>> >>>> pvec2 <- function(v, FUN, ..., mc.set.seed = TRUE, mc.silent = FALSE, >>>> mc.cores = getOption("mc.cores", 2L), mc.cleanup = >>>> TRUE) >>>> { >>>> env <- parent.frame() >>>> cores <- as.integer(mc.cores) >>>> if(cores < 1L) stop("'mc.cores' must be >= 1") >>>> if(cores == 1L) return(FUN(v, ...)) >>>> >>>> if(mc.set.seed) mc.reset.stream() >>>> >>>> n <- length(v) >>>> si <- splitIndices(n, cores) >>>> res <- do.call(c, >>>> mclapply(si, function(i) FUN(v[i], ...), >>>> mc.set.seed=mc.set.seed, >>>> mc.silent=mc.silent, >>>> mc.cores=mc.cores, >>>> mc.cleanup=mc.cleanup)) >>>> if (length(res) != n) >>>> warning("some results may be missing, folded or caused an error") >>>> res >>>> } >>>> pvec3 <- function(v, FUN, ..., mc.set.seed = TRUE, mc.silent = FALSE, >>>> mc.cores = getOption("mc.cores", 2L), mc.cleanup = >>>> TRUE, >>>> mc.preschedule=FALSE, num.chunks, chunk.size) >>>> { >>>> env <- parent.frame() >>>> cores <- as.integer(mc.cores) >>>> if(cores < 1L) stop("'mc.cores' must be >= 1") >>>> if(cores == 1L) return(FUN(v, ...)) >>>> >>>> if(mc.set.seed) mc.reset.stream() >>>> >>>> n <- length(v) >>>> if (missing(num.chunks)) { >>>> if (missing(chunk.size)) { >>>> num.chunks <- cores >>>> } else { >>>> num.chunks <- ceiling(n/chunk.size) >>>> } >>>> } >>>> si <- splitIndices(n, num.chunks) >>>> res <- do.call(c, >>>> mclapply(si, function(i) FUN(v[i], ...), >>>> mc.set.seed=mc.set.seed, >>>> mc.silent=mc.silent, >>>> mc.cores=mc.cores, >>>> mc.cleanup=mc.cleanup, >>>> mc.preschedule=mc.preschedule)) >>>> if (length(res) != n) >>>> warning("some results may be missing, folded or caused an error") >>>> res >>>> } >>>> >>>> pvectorize <- function(FUN) { >>>> function(...) pvec3(FUN=FUN, ...) >>>> } >>> >>> >>> would be great to have these as 'pull' requests in github; pvec3 as a >>> replacement for pvec, if it's implementing the same concept but better. >>> >>> Unit tests would be good (yes being a little hypocritical). >>> inst/unitTests, using RUnit, examples in >>> >>> >>> https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/IRanges/inst/unitTests >>> >>> >>> with username / password readonly >>> >>> Martin >>> >>>> On Wed 14 Nov 2012 02:23:30 PM PST, Michael Lawrence wrote: >>>>> >>>>> >>>>> On Wed, Nov 14, 2012 at 12:23 PM, Martin Morgan <mtmor...@fhcrc.org> >>>>> wrote: >>>>> >>>>>> >>>>>> Interested developers -- I added the start of a BiocParallel >>>>>> package to >>>>>> the Bioconductor subversion repository and build system. >>>>>> >>>>>> The package is mirrored on github to allow for social coding; I >>>>>> encourage >>>>>> people to contribute via that mechanism. >>>>>> >>>>>> >>>>>> https://github.com/**Bioconductor/BiocParallel<https://github.com/Bioconductor/BiocParallel> >>>>>> >>>>>> >>>>>> >>>>>> The purpose is to help focus our efforts at developing appropriate >>>>>> parallel paradigms. Currently the package Imports: parallel and >>>>>> implements >>>>>> pvec and mclapply in a way that allows for operation on any vector >>>>>> or list >>>>>> supporting length(), [, and [[ (the latter for mclapply). pvec in >>>>>> particular seems to be appropriate for GRanges-like objects, where >>>>>> we don't >>>>>> necessarily want to extract many thousands of S4 instances of >>>>>> individual >>>>>> ranges with [[. >>>>>> >>>>> >>>>> >>>>> Makes sense. Besides, [[ does not even work on GRanges. One >>>>> limitation of >>>>> pvec is that it does not support a chunk size; it just uses length(x) / >>>>> ncores. It would be nice to be able to restrict that, which would then >>>>> require multiple jobs per core. Unless I'm missing something. >>>>> >>>>> >>>>>> >>>>>> >>>>>> Hopefully the ideas in the package can be folded back in to >>>>>> parallel as >>>>>> they mature. >>>>>> >>>>>> Martin >>>>>> -- >>>>>> Dr. Martin Morgan, PhD >>>>>> Fred Hutchinson Cancer Research Center >>>>>> 1100 Fairview Ave. N. >>>>>> PO Box 19024 Seattle, WA 98109 >>>>>> >>>>>> ______________________________**_________________ >>>>>> Bioc-devel@r-project.org mailing list >>>>>> >>>>>> https://stat.ethz.ch/mailman/**listinfo/bioc-devel<https://stat.ethz.ch/mailman/listinfo/bioc-devel> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> [[alternative HTML version deleted]] >>>>> >>>>> _______________________________________________ >>>>> Bioc-devel@r-project.org mailing list >>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel >>> >>> >>> >> >> _______________________________________________ >> Bioc-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/bioc-devel > > _______________________________________________ > Bioc-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/bioc-devel _______________________________________________ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel