Let me add my opinion: we do not have perfect (easy) reproducibility with
Bioc because we can only (easily) download the tarball corresponding to
the latest commit in a given branch.  I am ok with that.  What I (and
Alejandro) am concerned about is the inability to install even that.

There is a clear candidate for which version of the CRAN package we should
store: the version we use when we run R CMD check.  This is the version we
implicitly say things are working with.
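
For a sense of scale, here is a rough back-of-the-envelope sketch in Python
(my own illustration, not anything the build system runs), using the counts
Hervé gives below. It only subtracts the 750 software packages, because the
number of annotation and experiment packages is given only as "a few more
hundreds", so each result is an upper bound on the CRAN packages we would
need to store. The names in it are hypothetical.

```python
# Installed-package counts per build machine, as quoted from the
# BioC 2.13 build report in the message below.
installed = {
    "zin1 (Linux)": 1510,
    "moscato1 (Windows)": 1486,
    "perceval (Mac)": 1500,
}

BIOC_SOFTWARE = 750  # BioC software packages, from the same report
CRAN_TOTAL = 5400    # approximate size of CRAN, as quoted below


def cran_upper_bound(total_installed, bioc_software=BIOC_SOFTWARE):
    """Upper bound on the CRAN packages installed on one build machine.

    Annotation and experiment packages are NOT subtracted, because their
    exact number is not stated, so the true count is lower than this.
    """
    return total_installed - bioc_software


bounds = {host: cran_upper_bound(n) for host, n in installed.items()}
max_fraction = max(bounds.values()) / CRAN_TOTAL

for host, bound in bounds.items():
    print(f"{host}: at most {bound} CRAN packages")
print(f"at most {max_fraction:.1%} of CRAN")
```

So even before removing the annotation and experiment packages, the subset
is well under a fifth of CRAN.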

Best,
Kasper


On Fri, Apr 25, 2014 at 7:41 AM, Hervé Pagès <hpa...@fhcrc.org> wrote:

> Hi,
>
> See the latest software builds for BioC 2.13:
>
>   http://bioconductor.org/checkResults/2.13/bioc-20140405/
>
> The number of packages that needed to be installed on the build
> system in order to build and check the 750 BioC software packages
> is displayed in the right-most column of the top table:
>
>   1510 on zin1 (Linux)
>   1486 on moscato1 (Windows)
>   1500 on perceval (Mac)
>
> If you click on these numbers, you get the full list of packages
> plus their version.
>
> Once you've subtracted the 750 software packages + the number of data
> annotation and data experiment packages (a few more hundreds) from
> these numbers, that gives you the number of CRAN packages that
> BioC 2.13 depends on. Not that many really (only a very small fraction
> of the 5400 CRAN packages).
>
> If we hosted only this small subset of CRAN packages under
>
>   http://bioconductor.org/packages/2.13/cran
>
> next to the other 4 frozen repos
>
>   http://bioconductor.org/packages/2.13/bioc
>   http://bioconductor.org/packages/2.13/data/annotation
>   http://bioconductor.org/packages/2.13/experiment
>   http://bioconductor.org/packages/2.13/extra
>
> and have biocLite() modified to point to
>
>    http://bioconductor.org/packages/2.13/cran
>
> instead of
>
>   http://cran.fhcrc.org
>
> then anybody that has R 3.0.3 could *easily* install and run
> BioC 2.13 now or in 5 years from now.
>
> Cheers,
> H.
>
>
>
> On 04/24/2014 08:09 AM, Steve Lianoglou wrote:
>
>> Hi all,
>>
>> Just saw this tangentially related link to "packrat", which seems to be
>> something analogous to a virtualenv (of sorts) for R by the RStudio
>> folks, and which I thought might be useful.
>>
>> It actually doesn't solve anybody's problem here, but as I said ...
>> tangential :-)
>>
>> http://rstudio.github.io/packrat/
>>
>>
>> On Thursday, April 24, 2014, Wolfgang Huber <whu...@embl.de> wrote:
>>
>>> Hi Kasper
>>>
>>> you are right, I had misunderstood the problem.
>>> In that case I agree with Martin that the problem resolves into
>>> components that are either intractable, already addressed by
>>> deprecation policies, or not very important.
>>> Sorry for the noise.
>>>
>>>          Wolfgang
>>>
>>> On 24 Apr 2014, at 15:18, Kasper Daniel Hansen <
>>> kasperdanielhan...@gmail.com> wrote:
>>>
>>>> Wolfgang,
>>>>
>>>> Alejandro did not have a problem with the current release, but with
>>>> the most recent prior release.  His issue is precisely because it is
>>>> no longer the current (stable) release.
>>>>
>>>> Kasper
>>>>
>>>>
>>>> On Thu, Apr 24, 2014 at 3:05 PM, Wolfgang Huber <whu...@embl.de> wrote:
>>>> Hi Martin
>>>> to come back to the original trigger for this thread: it was not
>>>> concerns for reproducibility, but the fact that a Bioc package in the
>>>> current release stopped working because a CRAN package has changed in
>>>> the meanwhile.
>>>> What's the most practical solution to this specific problem?
>>>>          Best wishes
>>>>          Wolfgang
>>>>
>>>>
>>>>
>>>>
>>>> On 23 Apr 2014, at 19:41, Martin Morgan <mtmor...@fhcrc.org> wrote:
>>>>
>>>>> On 04/22/2014 09:47 AM, Kasper Daniel Hansen wrote:
>>>>>
>>>>>> I think we should have a CRAN snapshot (or a subset of CRAN used in
>>>>>> Bioc) inside each Bioc release; I don't know how hard that is to
>>>>>> manage from a technical point of view.
>>>>>
>>>>> I followed this thread with some interest.
>>>>>
>>>>> It would be surprisingly challenging to update even a 2.13 package --
>>>>> the build machines have moved on to other tasks, unconstrained by the
>>>>> unique system dependencies needed for 2.13 builds.
>>>>>
>>>>> The idea of a 'forever' repository snapshot seems possible, but would
>>>>> the snapshot be at the beginning of the release and hence miss the few
>>>>> but important bug fixes introduced during the release, or at the end
>>>>> of the release, which might be after the time required for the
>>>>> purposes of replication? Either way it is certain that the peanut
>>>>> butter would land face down for one's particular need. Also, the need
>>>>> for the user to satisfy system dependencies becomes increasingly
>>>>> challenging, even with a binary repository. I don't think a central
>>>>> 'Bioc' solution would really address the problem of reproducibility.
>>>>>
>>>>> It is not that 'hard' for an individual group to create a snapshot of
>>>>> Bioc and CRAN, using rsync
>>>>>
>>>>>   http://www.bioconductor.org/about/mirrors/mirror-how-to/
>>>>>   http://cran.r-project.org/mirror-howto.html
>>>>>
>>>>> and to use install.packages() or even biocLite to access these (see
>>>>> ?setRepositories). This would again require that the system
>>>>> dependencies for these packages are satisfied in some kind of frozen
>>>>> fashion.
>>>>>
>>>>> A more robust possibility is of course a virtual machine, such as the
>>>>> AMI (or a customized version) we provide
>>>>>
>>>>>   http://www.bioconductor.org/help/bioconductor-cloud-ami/#ami_ids
>>>>>
>>>>> although these have only a subset of packages installed by default.
>>>>>
>>>>> The CRAN thread referenced earlier included this post
>>>>>
>>>>>   https://stat.ethz.ch/pipermail/r-devel/2014-March/068605.html
>>>>>
>>>>> which I think makes an important distinction between exact replication
>>>>> and scientific reproducibility; it is the latter that must be the most
>>>>> interesting, and the former that we somehow seem to stumble over. The
>>>>> thread also mentions best practices -- version control
>>>>>
>>>>>   http://bioconductor.org/developers/how-to/source-control/
>>>>>
>>>>> disciplined approach to deprecation
>>>>>
>>>>>   http://bioconductor.org/developers/how-to/deprecation/
>>>>>
>>>>> package versioning
>>>>>
>>>>>   http://bioconductor.org/developers/how-to/version-numbering/
>>>>>
>>>>> and the Bioc-style approach to release that we as developers can act
>>>>> on to enhance reproducibility. What other best pract
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
>
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fhcrc.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
>
>
> _______________________________________________
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>


