Hi, all.

Zenodo does offer storage (I believe limited to 50 GB per submission) and is 
backed by CERN, with a commitment to retain data for at least 20 years (tied 
to the lifetime of CERN, and this could be extended).

I agree that GitHub is a viable alternative to Zenodo, and the two can easily 
be used together. On GitHub, there are three different approaches: 1) check 
files into version control, 2) use Git LFS, and 3) attach files as release 
artifacts. Each has pros and cons.
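
For approaches 1 and 3, the download URL can be pinned to a commit or a 
release tag, which goes some way toward the concern about files changing 
without notice. A minimal sketch in Python (for illustration only; the same 
URL patterns can be used from R). The owner, repository, and file names in 
the usage lines are hypothetical, as are the two helper function names:

```python
from urllib.parse import quote


def raw_file_url(owner, repo, ref, path):
    """URL of a file checked into version control, as of a given ref.

    A full commit SHA is the safest ref: branch names (and tags, if
    someone re-tags) can later point at different content.
    """
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{quote(path)}"


def release_asset_url(owner, repo, tag, asset):
    """URL of a file attached to a release as an artifact."""
    return (
        f"https://github.com/{owner}/{repo}/releases/download/"
        f"{quote(tag)}/{quote(asset)}"
    )


# Hypothetical names, purely for illustration.
print(raw_file_url("someuser", "somedata", "0123abc", "data/measurements.csv"))
print(release_asset_url("someuser", "somedata", "v1.0.0", "measurements.csv"))
```

Pinning the URL does not prevent the repository owner from deleting the file 
or the release, so it helps with silent changes but not with disappearance.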

If the data are biomedical in nature, one can often deposit them in a 
biomedical repository, including one of the dozens at NIH and EBI. NIH even 
recommends some open “generalist repositories” that include Zenodo and OSF: 
https://www.nlm.nih.gov/NIHbmic/generalist_repositories.html

If one is looking to cater to the machine learning/AI community, hosting on 
Hugging Face is another option. Doing so is quite similar to hosting on GitHub 
from a purely practical perspective.

Cloud storage from providers such as AWS, GCP, and Azure is a possibility, but 
egress charges can be hard to predict. Cloudflare R2 is S3-compatible and has 
no egress charges, making it a good choice for sharing particularly large files.

On the client side, Bioconductor has BiocFileCache, a package for caching 
files that have been downloaded. Other file download/cache packages are 
available, though I’m less familiar with them.
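
The caching pattern itself is simple, whichever package provides it. A 
minimal sketch in Python using only the standard library. This is not 
BiocFileCache's API; `cached_download`, the cache layout, and the optional 
checksum argument are hypothetical and only illustrate the idea:

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_download(url, cache_dir, expected_sha256=None):
    """Return a local path for `url`, downloading only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Key the entry on the URL, so each versioned URL gets its own file.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    target = cache_dir / key

    if not target.exists():
        # Download to a temporary name first, so an interrupted transfer
        # never leaves a partial file under the final name.
        tmp = target.with_name(target.name + ".part")
        with urllib.request.urlopen(url) as resp, open(tmp, "wb") as out:
            shutil.copyfileobj(resp, out)
        tmp.replace(target)

    # Optional integrity check: catches upstream files that changed.
    if expected_sha256 is not None and sha256_of(target) != expected_sha256:
        target.unlink()
        raise ValueError(f"checksum mismatch for {url}")
    return target
```

Recording the expected checksum alongside the URL means a file that changed 
upstream fails loudly instead of silently altering results.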

Just wanted to expand the list a bit.

Sean


From: R-package-devel <r-package-devel-boun...@r-project.org> on behalf of Dirk 
Eddelbuettel <e...@debian.org>
Date: Saturday, February 15, 2025 at 10:29 AM
To: Simon Urbanek <simon.urba...@r-project.org>
Cc: R-package-devel@r-project.org <R-package-devel@r-project.org>
Subject: Re: [R-pkg-devel] Retrieving versioned csv datasets for use in an R 
package

On 15 February 2025 at 19:50, Simon Urbanek wrote:
| Github is not reliable enough for reproducible research (your files can
| disappear at any point - or can change without notice),

I'm curious: Do you have a concrete example of a no-longer-reproducible study
whose data or other support files changed and thereby caused this breakage?

| that's why Zenodo was created.

But AFAIK Zenodo offers DOI issuance only, not storage (as, say, OSF would).
So this does not address the problem faced by the OP.

Dirk

--
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
