Re: [Bioc-devel] BiocParallel

Ryan C. Thompson Wed, 14 Nov 2012 15:55:38 -0800

Here are two alternative implementations of pvec. pvec2 is just a simplerewrite of pvec to use mclapply. pvec3 then extends pvec2 to accept aspecified chunk size or a specified number of chunks. If the number ofchunks exceeds the number of cores, then multiple chunks will get runsequentially on each core. pvec3 also exposes the "mc.prescheule"argument of mclapply, since that is relevant when there are more chunksthan cores. Lastly, I provide a "pvectorize" function which can becalled on a regular vectorized function to make it into a pvec'd versionof itself. Usage is like: sqrt.parallel <- pvectorize(sqrt);sqrt.parallel(1:1000).


pvec2 <- function(v, FUN, ..., mc.set.seed = TRUE, mc.silent = FALSE,
                  mc.cores = getOption("mc.cores", 2L), mc.cleanup = TRUE)
{
  env <- parent.frame()
  cores <- as.integer(mc.cores)
  if(cores < 1L) stop("'mc.cores' must be >= 1")
  if(cores == 1L) return(FUN(v, ...))


  if(mc.set.seed) mc.reset.stream()

  n <- length(v)
  si <- splitIndices(n, cores)
  res <- do.call(c,
                 mclapply(si, function(i) FUN(v[i], ...),
                          mc.set.seed=mc.set.seed,
                          mc.silent=mc.silent,
                          mc.cores=mc.cores,
                          mc.cleanup=mc.cleanup))
  if (length(res) != n)
    warning("some results may be missing, folded or caused an error")
  res
}

pvec3 <- function(v, FUN, ..., mc.set.seed = TRUE, mc.silent = FALSE,
                  mc.cores = getOption("mc.cores", 2L), mc.cleanup = TRUE,
                  mc.preschedule=FALSE, num.chunks, chunk.size)
{
  env <- parent.frame()
  cores <- as.integer(mc.cores)
  if(cores < 1L) stop("'mc.cores' must be >= 1")
  if(cores == 1L) return(FUN(v, ...))

  if(mc.set.seed) mc.reset.stream()

  n <- length(v)
  if (missing(num.chunks)) {
    if (missing(chunk.size)) {
      num.chunks <- cores
    } else {
      num.chunks <- ceiling(n/chunk.size)
    }
  }
  si <- splitIndices(n, num.chunks)
  res <- do.call(c,
                 mclapply(si, function(i) FUN(v[i], ...),
                          mc.set.seed=mc.set.seed,
                          mc.silent=mc.silent,
                          mc.cores=mc.cores,
                          mc.cleanup=mc.cleanup,
                          mc.preschedule=mc.preschedule))
  if (length(res) != n)
    warning("some results may be missing, folded or caused an error")
  res
}

pvectorize <- function(FUN) {
  function(...) pvec3(FUN=FUN, ...)
}

On Wed 14 Nov 2012 02:23:30 PM PST, Michael Lawrence wrote:

On Wed, Nov 14, 2012 at 12:23 PM, Martin Morgan <mtmor...@fhcrc.org>wrote:


Interested developers -- I added the start of a BiocParallel package to
the Bioconductor subversion repository and build system.

The package is mirrored on github to allow for social coding; I encourage
people to contribute via that mechanism.

https://github.com/**Bioconductor/BiocParallel<https://github.com/Bioconductor/BiocParallel>

The purpose is to help focus our efforts at developing appropriate

parallel paradigms. Currently the package Imports: parallel andimplementspvec and mclapply in a way that allows for operation on any vector orlist

supporting length(), [, and [[ (the latter for mclapply). pvec in

particular seems to be appropriate for GRanges-like objects, where wedon't

necessarily want to extract many thousands of S4 instances of individual
ranges with [[.



Makes sense. Besides, [[ does not even work on GRanges. One limitation of
pvec is that it does not support a chunk size; it just uses length(x) /
ncores. It would be nice to be able to restrict that, which would then
require multiple jobs per core. Unless I'm missing something.



Hopefully the ideas in the package can be folded back in to parallel as
they mature.

Martin
--
Dr. Martin Morgan, PhD
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

______________________________**_________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/**listinfo/bioc-devel<https://stat.ethz.ch/mailman/listinfo/bioc-devel>



[[alternative HTML version deleted]]

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] BiocParallel

Reply via email to