Thanks Dan and Martin for the advice.

It's interesting that this doesn't seem to have come up much before, given
our shared obsession with reproducible analyses. I'm sure you'd agree that
(eg) any third party source update should mean a new, explicit dependency
chain for a pipeline. Unfortunately, IMHO, truly reproducible analysis
implies consistently reproducible bugs.

Dan's solution looks doable: it gives us the incantation needed to pin
absolutely specific BioC dependencies, so that the installation script we
write can check out, compile and install each one. That will get us what we
need, assuming your svn server is happy with the extra load. With the new
toolshed automated dependency mechanism, every Galaxy instance installing
any Galaxy tool that needs a BioC package will hit your svn server for a
revision-specific checkout, which may be effective but is a little
inefficient. On that score, given that we won't be freezing anything other
than *released* packages, a mechanism that provides easy to discover, stable
urls for archived released package versions would help lower the load on
your svn server. If we're the only ones wanting this, though, I understand
that it's going to be a very low priority for the bioc team.
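
For what it's worth, here is a rough sketch of how the install script might
wrap Dan's recipe. The function names are made up for illustration. It
assumes that each entry in the `svn log --diff` output starts with a header
line of the form "r54800 | author | date | ...", and that R CMD INSTALL can
install straight from the checked-out directory. None of this has been
tested against your server yet.

```shell
#!/bin/sh
# Sketch only: these function names are hypothetical and are not part of
# any existing BioC or Galaxy tooling.

# Read `svn log --diff DESCRIPTION` output on stdin and print the revision
# whose diff adds the line "Version: <version>".
# Assumption: every log entry begins with a header such as
# "r54800 | author | date | 1 line".
find_version_revision() {
    awk -v want="+Version: $1" '
        /^r[0-9]+ \|/ { rev = substr($1, 2) }   # remember the current revision
        $0 == want    { print rev; exit }       # diff line that adds the version
    '
}

# Check out a package at a pinned revision, then install it from source.
# $1 = svn URL, $2 = revision number, $3 = target directory
install_pinned() {
    svn checkout -r "$2" "$1" "$3" &&
    R CMD INSTALL "$3"
}
```

In a checked-out edgeR working directory, the lookup would be run as
`svn log --diff DESCRIPTION | find_version_revision 2.2.0` (Dan found
revision 54800 for 2.2.0 by hand). The result and the package url would then
be passed to `install_pinned`. Once a revision is known it can be hard-coded
in the tool's dependency definition, so the log only has to be searched once
per pinned version.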

We'll test this out!


On Tue, Jul 9, 2013 at 2:53 PM, Dan Tenenbaum <dtene...@fhcrc.org> wrote:

> On Mon, Jul 8, 2013 at 9:14 PM, Ross Lazarus
> <ross.laza...@channing.harvard.edu> wrote:
> > Hi, Bioconductor devs,
> >
> > In very rare cases (eg
> > http://article.gmane.org/gmane.science.biology.informatics.conductor/35266/match=update+edgeR
> > )
> > where BioC package authors have released an urgent bug fix within a given
> > BioC update cycle, the usual automated biocLite installation process does
> > not appear to support recreating a very specific R/Bioc environment
> > containing a precisely specified package release (say the previous edgeR
> > 2.2.0 fixed by 2.2.5). You might argue a user should never do this, but
> > since we want truly reproducible analyses (the context is the new toolshed
> > dependency control mechanisms in Galaxy), we need to control *all*
> > dependencies for a given release of a (eg edgeR) wrapper at this very fine
> > level of granularity, acknowledging that reproducible =/= valid.
> >
> > Take that edgeR update as a test case. I know about
> > http://bioconductor.org/checkResults/ and eg
> > https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/edgeR/ but I
> > have been unable to figure out how to track down the 2.2.0 edgeR archive
> > (or for that matter the 2.2.5 bugfix) - I'm sure it is in svn somewhere.
> > Any advice on how I can identify a long term reliable svn or other url to
> > script the download of a specific (even if known buggy) historical archive
> > of (eg) edgeR 2.2.0?
> >
>
> In the checked-out edgeR working directory, do:
>
> svn log --diff DESCRIPTION > diff.txt
>
> (this requires subversion >= 1.7)
>
> Then look in diff.txt for "Version: 2.2.0". This ends up being revision
> 54800.
>
> Then you can check that out to a different directory with
>
> svn co -r54800 https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/edgeR/
>
> Dan
>
>
> For more on the topic you bring up, see the thread started by
> https://stat.ethz.ch/pipermail/bioconductor/2013-March/051224.html
>
>
>
> > _______________________________________________
> > Bioc-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
