On 04/22/2014 09:47 AM, Kasper Daniel Hansen wrote:
I think we should have a CRAN snapshot (or a subset of CRAN used in Bioc)
inside each Bioc release; I don't know how hard that is to manage from a
technical point of view.

I followed this thread with some interest.

It would be surprisingly challenging to update even a 2.13 package -- the build machines have moved on to other tasks, unconstrained by the unique system dependencies needed for 2.13 builds.

The idea of a 'forever' repository snapshot seems possible, but would the snapshot be at the beginning of the release and hence miss the few but important bug fixes introduced during the release, or at the end of the release, which might be after the time required for the purposes of replication? Either way it is certain that the peanut butter would land face down for one's particular need. Also, the need for the user to satisfy system dependencies becomes increasingly challenging, even with a binary repository. I don't think a central 'Bioc' solution would really address the problem of reproducibility.

It is not that 'hard' for an individual group to create a snapshot of Bioc and CRAN, using rsync

  http://www.bioconductor.org/about/mirrors/mirror-how-to/
  http://cran.r-project.org/mirror-howto.html

and to use install.packages() or even biocLite() to access these (see ?setRepositories). This would again require that the system dependencies for these packages are satisfied in some kind of frozen fashion.
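
A minimal sketch of pointing an R session at such a snapshot, not an official procedure. It assumes the mirrors were rsync'ed under one local root; the root path, the helper name snapshot_repos(), and the directory layout below the root are all hypothetical and would need adjusting to the actual snapshot.

```r
# Build a named vector of repository URLs rooted at a local snapshot.
# ASSUMPTION: 'root' holds a Bioc mirror under bioc/packages/<version>/
# and a CRAN mirror under cran/; neither path comes from this thread.
snapshot_repos <- function(root, bioc_version) {
    bioc <- file.path(root, "bioc", "packages", bioc_version)
    c(
        BioCsoft = paste0("file://", file.path(bioc, "bioc")),
        BioCann  = paste0("file://", file.path(bioc, "data", "annotation")),
        BioCexp  = paste0("file://", file.path(bioc, "data", "experiment")),
        CRAN     = paste0("file://", file.path(root, "cran"))
    )
}

# Hypothetical snapshot location and release.
repos <- snapshot_repos("/data/snapshots/2014-04-22", "2.13")

# Make the snapshot the default for install.packages() in this session.
options(repos = repos)

# With the snapshot in place, installation would then read from it, e.g.
# install.packages("limma")
```

Setting the repos option only redirects where packages are fetched from; the R version and the system libraries still have to match whatever the snapshot was built against.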

A more robust possibility is of course a virtual machine, such as the AMI (or a customized version) we provide

  http://www.bioconductor.org/help/bioconductor-cloud-ami/#ami_ids

although these have only a subset of packages installed by default.

The CRAN thread referenced earlier included this post

  https://stat.ethz.ch/pipermail/r-devel/2014-March/068605.html

which I think makes an important distinction between exact replication and scientific reproducibility; it is the latter that must be the most interesting, and the former that we somehow seem to stumble over. The thread also mentions best practices -- version control

  http://bioconductor.org/developers/how-to/source-control/

a disciplined approach to deprecation

  http://bioconductor.org/developers/how-to/deprecation/

package versioning

  http://bioconductor.org/developers/how-to/version-numbering/

and the Bioc-style approach to release that we as developers can act on to enhance reproducibility. What other best practices can we more forcefully / conveniently adopt within the project?

Martin


Best,
Kasper


On Tue, Apr 22, 2014 at 6:06 PM, Julian Gehring <julian.gehr...@embl.de> wrote:

Hi,

For most problems discussed here, it seems that having a fixed version of
a package is sufficient rather than a specific version.  If the idea of a
snapshot with each bioc release would work (which still means one version
per package), so would requiring that version within the package (one would
just need to agree which version this is).

Best wishes

Julian


  What if two Bioc packages require different versions of the ‘same’ CRAN
package?
As far as I understand, the infrastructure is not designed to deal with
multiple versions of a package.

Nor would I as a user expect to have less-than-the-most recent versions
of CRAN packages in my library just because some other package says so…

Just to throw in another, and probably silly suggestion: the Bioconductor
repository could keep ‘snapshots’ of CRAN packages compatible with each
release, but they would have to be name-mangled in some way. The potential
for confusion is enormous.


_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
