Thanks Dan and Martin for the advice. It's interesting that this doesn't seem to have come up much before, given our shared obsession with reproducible analyses. I'm sure you'd agree that (e.g.) any third-party source update should mean a new, explicit dependency chain for a pipeline. Unfortunately, IMHO, truly reproducible analysis implies consistently reproducible bugs.
Dan's solution looks doable. It gives us the incantation needed to pin absolutely specific BioC dependencies, so when we write an installation script it can check out, build and install the exact revision. That will get us what we need, assuming your svn server is happy with the extra load?

With the new toolshed automated dependency mechanism, every Galaxy instance installing any Galaxy tool that needs a BioC package will hit your svn server for a revision-specific checkout, which may be effective but is a little inefficient. On that score, given that we won't be freezing anything other than *released* packages, a mechanism that provides easy-to-discover, stable URLs for archived released package versions would help lower the load on your svn server. If we're the only ones wanting this, though, I understand that it's going to be a very low priority for the BioC team.

We'll test this out!

On Tue, Jul 9, 2013 at 2:53 PM, Dan Tenenbaum <dtene...@fhcrc.org> wrote:

> On Mon, Jul 8, 2013 at 9:14 PM, Ross Lazarus
> <ross.laza...@channing.harvard.edu> wrote:
> > Hi, Bioconductor devs,
> >
> > In very rare cases (eg
> > http://article.gmane.org/gmane.science.biology.informatics.conductor/35266/match=update+edgeR
> > ) where BioC package authors have released an urgent bug fix within a
> > given BioC update cycle, the usual automated biocLite installation
> > process does not appear to support recreating a very specific R/Bioc
> > environment containing a precisely specified package release (say the
> > previous edgeR 2.2.0 fixed by 2.2.5). You might argue a user should
> > never do this, but since we want truly reproducible analyses (the
> > context is the new toolshed dependency control mechanisms in Galaxy),
> > we need to control *all* dependencies for a given release of a (eg
> > edgeR) wrapper at this very fine level of granularity, acknowledging
> > that reproducible =/= valid.
> >
> > Take that edgeR update as a test case. I know about
> > http://bioconductor.org/checkResults/ and eg
> > https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/edgeR/ but I
> > have been unable to figure out how to track down the 2.2.0 edgeR archive
> > (or for that matter the 2.2.5 bugfix) - I'm sure it is in svn somewhere.
> > Any advice on how I can identify a long term reliable svn or other url
> > to script the download of a specific (even if known buggy) historical
> > archive of (eg) edgeR 2.2.0?
>
> In the checked-out edgeR working directory, do:
>
> svn log --diff DESCRIPTION > diff.txt
>
> (this requires subversion >= 1.7)
>
> Then look in diff.txt for "Version: 2.2.0". This ends up being revision
> 54800.
>
> Then you can check that out to a different directory with
>
> svn co -r54800 https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/edgeR/
>
> Dan
>
> For more on the topic you bring up, see the thread started by
> https://stat.ethz.ch/pipermail/bioconductor/2013-March/051224.html

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
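For anyone scripting the lookup Dan describes, here is a minimal sketch of the "look in diff.txt" step. It assumes the `svn log --diff DESCRIPTION` output has already been saved to a file, and that each log entry begins with the usual `rNNNNN | author | date | ...` header line. The function name `find_version_revision` and the target directory `edgeR-2.2.0` are made up for illustration; they are not part of svn or Bioconductor.

```shell
# Hypothetical helper: print the revision whose diff adds "Version: <ver>".
# $1 = version string (e.g. 2.2.0), $2 = file holding `svn log --diff` output.
find_version_revision() {
  awk -v ver="$1" '
    # Remember the revision number from each log entry header, e.g. "r54800 | ...".
    /^r[0-9]+ \| / { rev = substr($1, 2) }
    # A diff line that adds exactly this Version field marks the revision we want.
    $0 == ("+Version: " ver) { print rev; exit }
  ' "$2"
}

# Intended use (commented out because it needs network access to the svn server):
#   svn log --diff DESCRIPTION > diff.txt
#   rev=$(find_version_revision 2.2.0 diff.txt)
#   svn co -r"$rev" https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/edgeR/ edgeR-2.2.0
```

The helper prints nothing when no revision adds the requested version, so an installation script should check for an empty result before attempting the checkout.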