On 19 Sep 2017, at 12:30 pm, Aaron Lun <a...@wehi.edu.au> wrote:
Well, inertia won out in the end, and so I've just moved a whole
stack of packages into "Suggests" for now. This is probably not a
sustainable solution as the workflow can potentially get larger over
time; I would prefer to have some formal support for splitting up
the workflow into modules that can be independently installed.
-Aaron
________________________________
From: Vincent Carey <st...@channing.harvard.edu>
Sent: Saturday, 16 September 2017 10:08:13 PM
To: Aaron Lun
Cc: Martin Morgan; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] [Untrusted Server] Re: strange error in
Jenkins build for singleCellWorkflow
IMHO the pedagogic value of a unified document that treats a topic
thoroughly is quite high. Building the whole workflow on an arbitrary
user's system seems to me to be a lower priority. Thus using the
environment variable in the build system to avoid this limit seems an
appropriate solution.
On Sat, Sep 16, 2017 at 7:43 AM, Aaron Lun <a...@wehi.edu.au> wrote:
Thanks Martin. Yes, it's quite unfortunate that scater drags in
dplyr and ggplot2, which - combined with Bioconductor's core
packages - already puts us pretty close to the limit without doing
anything else!
A solution might be to split my workflow into self-contained
components, each of which can become its own workflow package (e.g.,
simpleSingleCell1, simpleSingleCell2, simpleSingleCell3 and so on).
This should avoid all of the problems and our associated hacks.
I'm happy to do this, but is it possible for the website to indicate
that there is a connection between the component workflows? For
example, the link that ordinarily goes to the compiled workflow
could instead go to an indexing page, which contains links to
individual component workflows.
-Aaron
________________________________
From: Martin Morgan <martin.mor...@roswellpark.org>
Sent: Saturday, 16 September 2017 8:18:09 PM
To: Aaron Lun; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] [Untrusted Server] Re: strange error in
Jenkins build for singleCellWorkflow
On 09/16/2017 01:53 AM, Aaron Lun wrote:
Bumping this rather old thread. To re-iterate, I'm updating my
simpleSingleCell workflow and I'm running into R's DLL limit. I've
added a code block halfway through the workflow that unloads all
DLLs and cleans them out, and this works fine during compilation on
my local machine.
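For readers unfamiliar with the trick, a minimal sketch of such an
unloading block follows. It is not the code from the workflow itself:
the helper names loaded_dll_count() and unload_packages() are
hypothetical, and the sketch assumes that each package's DLL is
registered under the package's own name, which is common but not
guaranteed.

```r
# Number of DLLs currently registered with this R session.
loaded_dll_count <- function() length(getLoadedDLLs())

# Hypothetical helper: unload the namespaces of 'pkgs' and release their
# DLLs. Packages that are not loaded, or that cannot be unloaded because
# another loaded namespace still imports them, are skipped.
# Returns the names of the packages that were unloaded.
unload_packages <- function(pkgs) {
  unloaded <- character(0)
  for (pkg in pkgs) {
    if (!isNamespaceLoaded(pkg)) next
    ok <- tryCatch({
      unloadNamespace(pkg)
      TRUE
    }, error = function(e) FALSE)
    if (!ok) next
    # unloadNamespace() only releases the DLL if the package's own
    # .onUnload hook does so; otherwise release it explicitly.
    # Assumption: the DLL is registered under the package name.
    dlls <- getLoadedDLLs()
    if (pkg %in% names(dlls)) {
      dyn.unload(dlls[[pkg]][["path"]])
    }
    unloaded <- c(unloaded, pkg)
  }
  unloaded
}
```

Calling loaded_dll_count() before and after unload_packages() shows how
many slots were freed. Packages should be listed so that dependents come
before the packages they import, otherwise the latter are skipped.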
However, it seems that the BioC workflow builder uses a
pre-processing step whereby it first tries to load all packages
contained within library() calls. This hits the DLL limit as it
doesn't execute the protective code block, which defeats the
purpose of all my fiddling in the first place.
What options are there? I'm happy to split my workflow into
multiple smaller Rmarkdown files that get compiled separately,
provided there is appropriate support for this setup from the build
system.
The workflows have been standardized as packages. The packages put the
workflow dependencies in the 'Depends:' field, with the idea being that
the user installing the workflow package 'in the usual way' will get the
packages used in the vignette installed in their system 'in the usual
way' without having to execute special variants of biocLite() /
install.packages() / funky code in the vignette itself to be able to
build the vignette.
Loading a package also loads its Depends: (and Imports:), and so
triggers the problem.
Writing separate vignettes would not help with this, but might make the
workflow more palatable. I'm not 100% sure of support for separate
workflows in a single package; there is no problem with having multiple
workflow packages on the same general topic.
One could move (some?) packages to Suggests: and use your trick of
unloading packages part-way through the vignette. But then users will
find that they need to install packages to complete the vignette.
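One way to soften this for users is to have the vignette check up front
which of the suggested packages are absent. The sketch below is only an
illustration: missing_packages() is a hypothetical helper, and the
package names in the usage line are examples taken from this thread. It
deliberately uses system.file() rather than requireNamespace(), because
the latter would load the namespace (and its DLL) just to test for it.

```r
# Hypothetical helper: return the subset of 'pkgs' that is not installed
# in any library on the search path. system.file() returns "" for a
# package it cannot find, and it does not load the package.
missing_packages <- function(pkgs) {
  found <- vapply(pkgs,
                  function(p) nzchar(system.file(package = p)),
                  logical(1))
  unname(pkgs[!found])
}

# Example usage: warn about anything the vignette needs but cannot find.
needed <- c("scater", "GenomicFeatures")
absent <- missing_packages(needed)
if (length(absent) > 0) {
  message("Please install before continuing: ",
          paste(absent, collapse = ", "))
}
```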
'We' could add support for a BBS option that increases R_MAX_NUM_DLLS.
That would allow the workflow to build on the build system, but not on
the users' systems.
I think also the R-core approach to this
(https://stat.ethz.ch/pipermail/r-devel/2016-December/073529.html,
https://github.com/wch/r-source/commit/757bfa1d7ff373a604d6d34617f9cad78e0c875e)
is a little insightful. One could imagine increasing the default
R_MAX_NUM_DLLS, but apparently on some OSes loaded DLLs count against
the limit on the number of open files, and this in turn can be quite low.
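As a rough illustration of how a user could inspect this on their own
system, the sketch below reports the R_MAX_NUM_DLLS value visible to the
session alongside the number of DLLs in use. The helper name
dll_limit_setting() is hypothetical. Note that the variable has to be
set before R starts (for example with a line such as R_MAX_NUM_DLLS=150
in ~/.Renviron, where 150 is an arbitrary example value); changing it
with Sys.setenv() in a running session does not raise the limit.

```r
# Hypothetical helper: report the R_MAX_NUM_DLLS value visible to this
# session, or a note that it is unset. This only reads the environment
# variable; the limit actually in force was fixed when R started.
dll_limit_setting <- function() {
  value <- Sys.getenv("R_MAX_NUM_DLLS", unset = NA)
  if (is.na(value)) "unset (R's built-in default applies)" else value
}

cat("R_MAX_NUM_DLLS:", dll_limit_setting(), "\n")
cat("DLLs currently loaded:", length(getLoadedDLLs()), "\n")
```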
I note that users have already struggled with the DLL problem 'in the
wild' https://stackoverflow.com/a/45552926/547331. This seems
particularly problematic for workflows, which are appealing to
relatively novice users.
At the end of the day I think the workflows should make realistic use of
R resources. I think this means modifying the workflow to use fewer
DLLs. (This general comment is relevant to other workflows, which for
instance start by downloading very large data sets. I know that less
constrained use of computing resources is supposed to be a selling point
of the workflows, but in excess this seems counter-productive to their
primary use as pedagogic tools [rather than, for instance, comprehensive
exemplars of reproducible research].)
Maybe there is additional discussion about some of the technical aspects
of workflows that others might contribute.
Martin
Cheers
Aaron
________________________________
From: Bioc-devel <bioc-devel-boun...@r-project.org>
on behalf of Aaron Lun <a...@wehi.edu.au>
Sent: Wednesday, 21 June 2017 12:09:13 AM
To: bioc-devel@r-project.org
Subject: [Untrusted Server] Re: [Bioc-devel] strange error in
Jenkins build for singleCellWorkflow
Hi all,
I'm getting a curious error in the Jenkins log when I try to build
the singleCellWorkflow:
http://docbuilder.bioconductor.org:8080/job/simpleSingleCell/48/label=master/console
The key part is at the bottom:
Error: package or namespace load failed for 'GenomicFeatures' in
dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object
'/var/lib/jenkins/R/x86_64-pc-linux-gnu-library/3.4/Rsamtools/libs/Rsamtools.so':
`maximal number of DLLs reached...
The workflow had previously been running fine on the build system;
I'm not quite sure what's going on here, given that it's not even
failing at the point where I made the latest changes.
Cheers,
Aaron
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel