Gabriel,

Thanks for the clarification. I was avoiding depending on CodeDepends because I'm fairly certain that a BioC package can't depend on a package that isn't in either CRAN or Bioconductor. Since you point out that the librarySymbols code doesn't depend on any other part of the package, I think it would be fine to copy it into BiocParallel and use it to check functions for external dependencies, if that's what you're suggesting. Of course, we would add a comment noting that once CodeDepends makes it into CRAN, we should switch over to using that.

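To make the idea of checking a function for external dependencies concrete, here is a minimal sketch. It assumes codetools is used to find free symbols. check_globals() and its packages argument are hypothetical names of my own, not part of BiocParallel or CodeDepends, and this does not reproduce the librarySymbols code itself:

```r
# Hypothetical helper (not part of BiocParallel or CodeDepends): report the
# free symbols of `fun` that none of the listed packages export.
library(codetools)

check_globals <- function(fun, packages = c("base", "stats", "utils", "methods")) {
  # All global functions and variables referenced by `fun` (static analysis).
  globals <- findGlobals(fun, merge = TRUE)
  # Exported names of each package; this loads namespaces without attaching them.
  provided <- unlist(lapply(packages, getNamespaceExports))
  # Whatever is left would have to be shipped to the worker some other way.
  setdiff(globals, provided)
}

# Example: helper() and offset are not provided by any of the listed packages.
f <- function(x) {
  y <- helper(x)
  mean(y) + offset
}
missing_syms <- check_globals(f)
print(missing_syms)
```

Since this is only a static check, it would probably be safer to issue a warning than an error: names that are resolved at run time (for example column names used inside with()) would likely be reported as missing.
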
Side note 1: If we're talking about doing sanity checks on code, we could also check for any usage of non-local assignment ("<<-"), since we know that, when run in a subprocess, it will have no effect on the parent session, and the user might not expect that if they are not familiar with multi-process parallelism.

Side note 2: Your original link gave a 404 error because it had the word "Note" appended to it. Removing this gave a valid link: https://github.com/duncantl/CodeDepends/blob/forCRAN_0.3.5/R/librarySymbols.R

-Ryan

On 11/4/13, 12:13 PM, Gabriel Becker wrote:
> Ryan,
>
> I agree that in some sense it is a different problem, but my point is
> with a different approach we can easily answer both. The code I posted
> returns a named character vector of symbol names with package name
> being the name.
>
> This makes it a trivial lookup to determine both a) what symbols
> aren't available in any of the packages and b) what packages provide
> the remaining required symbols. No extra work required.
>
> You do have to give it a list of packages to check, but it is easy to
> write a wrapper that automatically passes it all currently attached
> packages if desired (a combination of search() and gsub() would be a
> quick and dirty way to do this).
>
> All that said, I'm simply trying to help. If you guys don't want to
> use my code/approach that is your prerogative as I'm not currently
> working on BiocParallel myself.
>
> ~G
>
>
> On Mon, Nov 4, 2013 at 11:54 AM, Ryan Thompson <r...@thompsonclan.org> wrote:
>
> The code that I wrote intentionally avoids checking for package
> variables, since I consider that a separate problem. Package
> variables can be provided to the child by loading the package,
> whereas user-defined variables must be serialized in the parent
> and sent to the child.
>
> I think I could fairly easily adapt the same code to return a list
> of all packages that a function depends on.
>
> -Ryan
>
> On Nov 4, 2013 11:35 AM, "Michael Lawrence" <lawrence.mich...@gene.com> wrote:
>
> The dynamic nature of R limits the extent of these checks. But as Ryan has
> noted, a simple sanity check goes a long way. If what he has done could be
> extended to the rest of the search path (people always forget to attach
> packages), I think we've hit the 80% with 20%. Got a 404 on that URL btw.
>
> Michael
>
>
> On Mon, Nov 4, 2013 at 11:05 AM, Gabriel Becker <gmbec...@ucdavis.edu> wrote:
>
> > Hey guys,
> >
> > Here is code that I have written which resolves library names into a full
> > list of symbols:
> >
> > https://github.com/duncantl/CodeDepends/blob/forCRAN_0.3.5/R/librarySymbols.RNote
> > this does not require that the packages actually be loaded at the time
> > of the check, and does not load them (or rather, it loads them but does not
> > attach them, so no searchpath muddying occurs). You do need a list of
> > packages to check though (it adds the base ones automatically). It handles
> > dependency and could be easily extended to handle suggests as well I think.
> >
> > When CodeDepends gets pushed to cran (not my call and not high on my
> > priority list to push for currently) it will actually do exactly what you
> > want. (the forCRAN_0.3.5 branch already does and I believe it is
> > documented, so you could use devtools to install it now).
> >
> > As a side note, I'm not sure that existence of a symbol is sufficient (it
> > certainly is necessary). What about situations where the symbol exists but
> > is stale compared to the value in the parent? Are we sure that can never
> > happen?
> >
> > ~G
> >
> >
> > On Mon, Nov 4, 2013 at 7:29 AM, Michel Lang <michell...@gmail.com> wrote:
> >
> > > You might want to consider using Recall() for recursion which should solve
> > > this.
> > > Determining the required variables using heuristics, as codetools does,
> > > will probably lead to some confusion when using functions which include
> > > calls to, e.g., with():
> > >
> > > f = function() {
> > >   with(iris, Sepal.Length + Sepal.Width)
> > > }
> > > codetools:::findGlobals(f)
> > >
> > > I would suggest writing up some documentation on what the function's
> > > environment contains and how to define variables accordingly - or why it
> > > can generally be considered a good idea to pass everything essential as an
> > > argument. Nevertheless a "bpExport" function would be a good addition for
> > > some rare corner cases in my opinion.
> > >
> > > Michel
> > >
> > >
> > > 2013/11/3 Henrik Bengtsson <h...@biostat.ucsf.edu>
> > >
> > > > Hi,
> > > >
> > > > in BiocParallel, is there a suggested (or planned) best standard for
> > > > making *locally* assigned variables (e.g. functions) available to the
> > > > applied function when it runs in a separate R process (which will be
> > > > the most common use case)? I understand that local variables
> > > > should be avoided and it's preferred to put as much as possible in
> > > > packages, but that's not always possible or very convenient.
> > > >
> > > > EXAMPLE:
> > > >
> > > > library('BiocParallel')
> > > > library('BatchJobs')
> > > >
> > > > # Here I pick a recursive function to make the problem a bit harder, i.e.
> > > > # the function needs to call itself ("itself" = see below)
> > > > fib <- function(n=0) {
> > > >   if (n < 0) stop("Invalid 'n': ", n)
> > > >   if (n == 0 || n == 1) return(1)
> > > >   fib(n-2) + fib(n-1)
> > > > }
> > > >
> > > > # Executing in the current R session
> > > > cluster.functions <- makeClusterFunctionsInteractive()
> > > > bpParams <- BatchJobsParam(cluster.functions=cluster.functions)
> > > > register(bpParams)
> > > > values <- bplapply(0:9, FUN=fib)
> > > > ## SubmitJobs |++++++++++++++++++++++++++++++++++| 100% (00:00:00)
> > > > ## Waiting [S:0 R:0 D:10 E:0] |+++++++++++++++++++| 100% (00:00:00)
> > > >
> > > >
> > > > # Executing in a separate R process, where fib() is not defined
> > > > # (not specific to BiocParallel)
> > > > cluster.functions <- makeClusterFunctionsLocal()
> > > > bpParams <- BatchJobsParam(cluster.functions=cluster.functions)
> > > > register(bpParams)
> > > > values <- bplapply(0:9, FUN=fib)
> > > > ## SubmitJobs |++++++++++++++++++++++++++++++++++| 100% (00:00:00)
> > > > ## Waiting [S:0 R:0 D:10 E:0] |+++++++++++++++++++| 100% (00:00:00)
> > > > Error in LastError$store(results = results, is.error = !ok, throw.error = TRUE) :
> > > >   Errors occurred during execution. First error message:
> > > > Error in FUN(...): could not find function "fib"
> > > > [...]
> > > >
> > > >
> > > > # The following illustrates that the solution is not always straightforward.
> > > > # (not specific to BiocParallel; must have been discussed previously)
> > > > values <- bplapply(0:9, FUN=function(n, fib) {
> > > >   fib(n)
> > > > }, fib=fib)
> > > > Error in LastError$store(results = results, is.error = !ok, throw.error = TRUE) :
> > > >   Errors occurred during execution. First error message:
> > > > Error in fib(n): could not find function "fib"
> > > > [...]
> > > >
> > > > # Workaround; make fib() aware of itself
> > > > # (this is something the user needs to do, and would be very
> > > > # hard for BiocParallel et al. to automate. BTW, should all
> > > > # recursive functions be implemented this way?).
> > > > fib <- function(n=0) {
> > > >   if (n < 0) stop("Invalid 'n': ", n)
> > > >   if (n == 0 || n == 1) return(1)
> > > >   fib <- sys.function() # Make function aware of itself
> > > >   fib(n-2) + fib(n-1)
> > > > }
> > > > values <- bplapply(0:9, FUN=function(n, fib) {
> > > >   fib(n)
> > > > }, fib=fib)
> > > >
> > > >
> > > > WISHLIST:
> > > > Considering the above recursive issue solved, a slightly more explicit
> > > > and standardized solution is then:
> > > >
> > > > values <- bplapply(0:9, FUN=function(n, BPGLOBALS=NULL) {
> > > >   for (name in names(BPGLOBALS)) assign(name, BPGLOBALS[[name]])
> > > >   fib(n)
> > > > }, BPGLOBALS=list(fib=fib))
> > > >
> > > > Could the above be generalized into something as neat as:
> > > >
> > > > bpExport("fib")
> > > > values <- bplapply(0:9, FUN=function(n) {
> > > >   BiocParallel::bpImport("fib")
> > > >   fib(n)
> > > > })
> > > >
> > > > or ideally just (analogously to parallel::clusterExport()):
> > > >
> > > > bpExport("fib")
> > > > values <- bplapply(0:9, FUN=fib)
> > > >
> > > > /Henrik
> > > >
> > > > _______________________________________________
> > > > Bioc-devel@r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
> > --
> > Gabriel Becker
> > Graduate Student
> > Statistics Department
> > University of California, Davis