Gabriel,

Thanks for the clarification. I was avoiding depending on CodeDepends 
because I'm fairly certain that a BioC package can't depend on a package 
that isn't in either CRAN or Bioconductor. Since you point out that the 
librarySymbols code doesn't depend on any other part of the package, I 
think it would be fine to copy it into BiocParallel and use it to check 
functions for external dependencies, if that's what you're suggesting. 
Of course, we would add a comment noting that once CodeDepends makes it 
into CRAN, we should switch over to using that.
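
A minimal sketch of what such a check could look like, assuming 
codetools::findGlobals is an acceptable stand-in until the 
librarySymbols code is copied over. The helper name globalOrigins is 
made up for illustration and is not part of CodeDepends or BiocParallel. 
For each global symbol a function uses, it reports the name of the 
environment that defines it, or NA if nothing reachable provides it:

```r
library(codetools)  # recommended package, shipped with standard R installs

# Hypothetical helper: for every global symbol used by `fun`, walk up from
# the function's enclosing environment and record where it is first defined.
globalOrigins <- function(fun, env = environment(fun)) {
  syms <- findGlobals(fun, merge = TRUE)
  vapply(syms, function(s) {
    e <- env
    while (!identical(e, emptyenv())) {
      if (exists(s, envir = e, inherits = FALSE)) return(environmentName(e))
      e <- parent.env(e)
    }
    NA_character_  # not found anywhere: would fail in the child as well
  }, character(1))
}

# Illustrative functions only; `undefinedThing` is deliberately not defined.
helper <- function(x) x + 1
f <- function(n) helper(n) + undefinedThing(n) + sd(n)

print(globalOrigins(f))
```

Symbols reported as NA would fail in the child, symbols defined in the 
user's workspace would have to be serialized and sent along, and 
everything else comes from a package that the child can load.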

Side note 1: If we're talking about doing sanity checks on code, we 
could also check for any usage of non-local assignment ("<<-"), since we 
know that an assignment made in the subprocess will have no effect in 
the parent session, and the user might not expect that if they are not 
familiar with multi-process parallelism.
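
A rough sketch of that check, using only base R. The helper name 
usesSuperAssign is hypothetical, and it only inspects the function body 
itself, not any functions that the body calls:

```r
# Hypothetical helper: TRUE if the body of `fun` contains a "<<-" call.
# all.names() lists every symbol in the expression, including operators.
usesSuperAssign <- function(fun) {
  stopifnot(is.function(fun))
  "<<-" %in% all.names(body(fun))
}

# Illustrative functions only.
counter <- 0
bump <- function(x) { counter <<- counter + x; x }
pure <- function(x) x + 1

usesSuperAssign(bump)  # TRUE
usesSuperAssign(pure)  # FALSE
```

A warning rather than an error is probably the right response, since 
"<<-" can be legitimate when it targets a variable in an enclosing 
function rather than the global environment.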

Side note 2: Your original link gave a 404 error because it had the word 
"Note" appended to it. Removing this gave a valid link: 
https://github.com/duncantl/CodeDepends/blob/forCRAN_0.3.5/R/librarySymbols.R

-Ryan


On 11/4/13, 12:13 PM, Gabriel Becker wrote:
> Ryan,
>
> I agree that in some sense it is a different problem, but my point is 
> with a different approach we can easily answer both. The code I posted 
> returns a named character vector of symbol names with package name 
> being the name.
>
> This makes it a trivial lookup to determine both a) what symbols 
> aren't available in any of the packages and b) what packages provide 
> the remaining required symbols. No extra work required.
>
> You do have to give it a list of packages to check, but it is easy to 
> write a wrapper that automatically passes it all currently attached 
> packages if desired (a combination of search() and gsub() would be a 
> quick and dirty way to do this).
>
> All that said, I'm simply trying to help. If you guys don't want to 
> use my code/approach that is your prerogative as I'm not currently 
> working on BiocParallel myself.
>
> ~G
>
>
>
>
> On Mon, Nov 4, 2013 at 11:54 AM, Ryan Thompson <r...@thompsonclan.org 
> <mailto:r...@thompsonclan.org>> wrote:
>
>     The code that I wrote intentionally avoids checking for package
>     variables, since I consider that a separate problem. Package
>     variables can be provided to the child by loading the package,
>     whereas user-defined variables must be serialized in the parent
>     and sent to the child.
>
>     I think I could fairly easily adapt the same code to return a list
>     of all packages that a function depends on.
>
>     -Ryan
>
>     On Nov 4, 2013 11:35 AM, "Michael Lawrence"
>     <lawrence.mich...@gene.com <mailto:lawrence.mich...@gene.com>> wrote:
>
>         The dynamic nature of R limits the extent of these checks. But
>         as Ryan has
>         noted, a simple sanity check goes a long way. If what he has
>         done could be
>         extended to the rest of the search path (people always forget
>         to attach
>         packages), I think we've hit the 80% with 20%. Got a 404 on
>         that URL btw.
>
>         Michael
>
>
>         On Mon, Nov 4, 2013 at 11:05 AM, Gabriel Becker
>         <gmbec...@ucdavis.edu <mailto:gmbec...@ucdavis.edu>>wrote:
>
>         > Hey guys,
>         >
>         > Here is code that I have written which resolves library
>         names into a full
>         > list of symbols:
>         >
>         >
>         
> https://github.com/duncantl/CodeDepends/blob/forCRAN_0.3.5/R/librarySymbols.RNote
>         > this does not require that the packages actually be loaded
>         at the time
>         > of the check, and does not load them (or rather, it loads
>         them but does not
>         > attach them, so no searchpath muddying occurs). You do need
>         a list of
>         > packages to check though (it adds the base ones
>         automatically). It handles
>         > dependency and could be easily extended to handle suggests
>         as well I think.
>         >
>         > When CodeDepends gets pushed to cran (not my call and not
>         high on my
>         > priority list to push for currently) it will actually do
>         exactly what you
>         > want. (the forCRAN_0.3.5 branch already does and I believe it is
>         > documented, so you could use devtools to install it now).
>         >
>         > As a side note, I'm not sure that existence of a symbol is
>         sufficient (it
>         > certainly is necessary). What about situations where the
>         symbol exists but
>         > is stale compared to the value in the parent? Are we sure
>         that can never
>         > happen?
>         >
>         > ~G
>         >
>         >
>         > On Mon, Nov 4, 2013 at 7:29 AM, Michel Lang
>         <michell...@gmail.com <mailto:michell...@gmail.com>> wrote:
>         >
>         > > You might want to consider using Recall() for recursion
>         which should
>         > solve
>         > > this. Determining the required variables using heuristics
>         as codetools
>         > will
>         > > probably lead to some confusion when using functions which
>         include calls
>         > > to, e.g., with():
>         > >
>         > > f = function() {
>         > >   with(iris, Sepal.Length + Sepal.Width)
>         > > }
>         > > codetools::findGlobals(f)
>         > >
>         > > I would suggest to write up some documentation on what the
>         function's
>         > > environment contains and how to define variables
>         accordingly - or why
>         > it
>         > > can generally be considered a good idea to pass everything
>         essential as
>         > an
>         > > argument. Nevertheless a "bpExport" function would be a
>         good addition for
>         > > some rare corner cases in my opinion.
>         > >
>         > > Michel
>         > >
>         > >
>         > > 2013/11/3 Henrik Bengtsson <h...@biostat.ucsf.edu
>         <mailto:h...@biostat.ucsf.edu>>
>         > >
>         > > > Hi,
>         > > >
>         > > > in BiocParallel, is there a suggested (or planned) best
>         standard for
>         > > > making *locally* assigned variables (e.g. functions)
>         available to the
>         > > > applied function when it runs in a separate R process
>         (which will be
>         > > > the most common use case)?  I understand that
>         local variables
>         > > > should be avoided and it's preferred to put as much as
>         possible in
>         > > > packages, but that's not always possible or very convenient.
>         > > >
>         > > > EXAMPLE:
>         > > >
>         > > > library('BiocParallel')
>         > > > library('BatchJobs')
>         > > >
>         > > > # Here I pick a recursive function to make the problem
>         a bit harder,
>         > > i.e.
>         > > > # the function needs to call itself ("itself" = see below)
>         > > > fib <- function(n=0) {
>         > > >   if (n < 0) stop("Invalid 'n': ", n)
>         > > >   if (n == 0 || n == 1) return(1)
>         > > >   fib(n-2) + fib(n-1)
>         > > > }
>         > > >
>         > > > # Executing in the current R session
>         > > > cluster.functions <- makeClusterFunctionsInteractive()
>         > > > bpParams <-
>         BatchJobsParam(cluster.functions=cluster.functions)
>         > > > register(bpParams)
>         > > > values <- bplapply(0:9, FUN=fib)
>         > > > ## SubmitJobs |++++++++++++++++++++++++++++++++++| 100%
>         (00:00:00)
>         > > > ## Waiting [S:0 R:0 D:10 E:0] |+++++++++++++++++++| 100%
>         (00:00:00)
>         > > >
>         > > >
>         > > > # Executing in a separate R process, where fib() is not
>         defined
>         > > > # (not specific to BiocParallel)
>         > > > cluster.functions <- makeClusterFunctionsLocal()
>         > > > bpParams <-
>         BatchJobsParam(cluster.functions=cluster.functions)
>         > > > register(bpParams)
>         > > > values <- bplapply(0:9, FUN=fib)
>         > > > ## SubmitJobs |++++++++++++++++++++++++++++++++++| 100%
>         (00:00:00)
>         > > > ## Waiting [S:0 R:0 D:10 E:0] |+++++++++++++++++++| 100%
>         (00:00:00)
>         > > > Error in LastError$store(results = results, is.error = !ok,
>         > throw.error =
>         > > > TRUE)
>         > > > :
>         > > >   Errors occurred during execution. First error message:
>         > > > Error in FUN(...): could not find function "fib"
>         > > > [...]
>         > > >
>         > > >
>         > > > # The following illustrates that the solution is not always
>         > > > straightforward.
>         > > > # (not specific to BiocParallel; must have been
>         discussed previously)
>         > > > values <- bplapply(0:9, FUN=function(n, fib) {
>         > > >   fib(n)
>         > > > }, fib=fib)
>         > > > Error in LastError$store(results = results, is.error = !ok,
>         > > > throw.error = TRUE) :
>         > > >   Errors occurred during execution. First error message:
>         > > > Error in fib(n): could not find function "fib"
>         > > > [...]
>         > > >
>         > > > # Workaround; make fib() aware of itself
>         > > > # (this is something the user needs to do, and would be very
>         > > > #  hard for BiocParallel et al. to automate.  BTW,
>         should all
>         > > > #  recursive functions be implemented this way?).
>         > > > fib <- function(n=0) {
>         > > >   if (n < 0) stop("Invalid 'n': ", n)
>         > > >   if (n == 0 || n == 1) return(1)
>         > > >   fib <- sys.function() # Make function aware of itself
>         > > >   fib(n-2) + fib(n-1)
>         > > > }
>         > > > values <- bplapply(0:9, FUN=function(n, fib) {
>         > > >   fib(n)
>         > > > }, fib=fib)
>         > > >
>         > > >
>         > > > WISHLIST:
>         > > > Considering the above recursive issue solved, a slightly
>         more explicit
>         > > > and standardized solution is then:
>         > > >
>         > > > values <- bplapply(0:9, FUN=function(n, BPGLOBALS=NULL) {
>         > > >   for (name in names(BPGLOBALS)) assign(name,
>         BPGLOBALS[[name]])
>         > > >   fib(n)
>         > > > }, BPGLOBALS=list(fib=fib))
>         > > >
>         > > > Could the above be generalized into something as neat as:
>         > > >
>         > > > bpExport("fib")
>         > > > values <- bplapply(0:9, FUN=function(n) {
>         > > >   BiocParallel::bpImport("fib")
>         > > >   fib(n)
>         > > > })
>         > > >
>         > > > or ideally just (analogously to parallel::clusterExport()):
>         > > >
>         > > > bpExport("fib")
>         > > > values <- bplapply(0:9, FUN=fib)
>         > > >
>         > > > /Henrik
>         > > >
>         > > > _______________________________________________
>         > > > Bioc-devel@r-project.org
>         <mailto:Bioc-devel@r-project.org> mailing list
>         > > > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>         > > >
>         > >
>         > >
>         >
>         >
>         >
>         > --
>         > Gabriel Becker
>         > Graduate Student
>         > Statistics Department
>         > University of California, Davis
>         >
>         >
>
>
>
>
> -- 
> Gabriel Becker
> Graduate Student
> Statistics Department
> University of California, Davis

