Hi Ale,

I'd personally use a more specialized solution, like Git LFS (Large File
Storage) on GitHub, for a versioned database. You should also check with
CRAN itself, as they keep high standards for anything that goes beyond a
standard install. More specifically (from the CRAN policies):

    Downloads of additional software or data as part of package
    installation or startup should only use secure download mechanisms
    (e.g., ‘https’ or ‘ftps’).

Personally I would store that information in a public database somewhere
with a (minimal) API. This can then be extended without inflating the
download, and would allow people to install only the subset they need.
It would also let people port your work to other languages by simply
writing a wrapper around the DB API.

It's not a necessity, but I thought it was worth mentioning as an option.

Cheers
Joris

On Wed, Jun 27, 2018 at 10:22 PM, alejandro baranek
<alejandrobara...@gmail.com> wrote:

> Right now the situation is this: about 150 polyhedra are published, but
> we have 800+ ready to publish and cannot publish all of them because of
> the package size.
>
> It is not a problem on GitHub; it's a problem on CRAN, with build time
> (we fixed the testing time with simple sampling techniques). I would
> like to hear more from experienced package developers about these
> issues, but we seem to have found a solution.
>
> We decided to create another GitHub repo, RpolyhedraDB. When you install
> the package, it downloads the database, at the tag recorded in the
> package's data folder, into a directory under the user's home directory.
> So the package will be minimal for CRAN, will be RR, and will install
> the database on first use (under Travis or other QA/continuous
> integration it will of course install it). It will be possible to set up
> different DB sizes using the tags, in case we find that preferable for
> users.
>
> Best, Ale.
>
> 2018-03-29 4:43 GMT-03:00 Berry Boessenkool <berryboessenk...@hotmail.com>:
>
> > I assume you cannot simply reduce the 150 to a few for demonstration
> > purposes?
> >
> > I have seen people using DRAT packages on GitHub for data, but GitHub
> > has size limits as well...
> >
> > No expert in this, but maybe this helps a little bit...
> >
> > Berry
> >
> > ------------------------------
> > *From:* R-package-devel <r-package-devel-boun...@r-project.org> on
> > behalf of alejandro baranek <alejandrobara...@gmail.com>
> > *Sent:* Tuesday, March 27, 2018 19:26
> > *To:* r-package-devel@r-project.org
> > *Subject:* [R-pkg-devel] Questions about making a database package
> > (Rpolyhedra)
> >
> > Hello group:
> >
> > We released Rpolyhedra V0.2 last month. It can scrape 800+ polyhedra
> > definitions from public sources. As of V0.2.4 we are publishing only
> > 150 because of the time needed to scrape and test all the polyhedra,
> > and the resulting size of the package. The difference is a
> > configuration setting in zzz.R that is very simple to change (anyone
> > who wants to try it can build the package themselves).
> > The source files of the polyhedra definitions alone are 12+ MB (we
> > include them in the data folder so that the package is
> > self-sufficient).
> >
> > But we have doubts about good practices for publishing a database
> > package.
> >
> > We think the solution is to split the package into an internal
> > Rpolyhedra-lib, open source but not on CRAN, and Rpolyhedra with a
> > catalog that lets it connect to that repo and download scraped
> > polyhedra on demand.
> >
> > We still have to think through how to connect the two repositories,
> > but before touching any code we want to hear from experienced package
> > developers and the community in general about how to do this.
> > Do you know of any package with analogous behavior? We didn't find
> > one.
> >
> > Best, Ale.
> >
> > --
> > alejandro baranek
> > @ken4rab <https://twitter.com/ken4rab>
> > qbotics <http://qbotics.tumblr.com/> | surferinvaders
> > <http://surferinvaders.tumblr.com> | algebraic-soundscapes
> > <http://imaginary.org/content/algebraic-soundscapes> | surfer-shuffle
> > <http://imaginary.org/program/surfer-shuffle>
> >
> > ______________________________________________
> > R-package-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel

--
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)
tel: +32 (0)9 264 61 79

-----------
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/

-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php