Hi Alejandro, Brooke Anderson gave a nice talk at useR!2017 addressing this exact issue. See https://schd.ws/hosted_files/user2017/19/anderson-eddelbuettel-use_r_talk.pdf for the slides. The basic idea is to use an external CRAN-like repository for the data back-end. Brooke used 'drat' to set up such a repo.
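A minimal sketch of that idea, assuming the bulky data lives in a separate data package served from a drat repository. The package name `RpolyhedraData` and the repository URL are placeholders for illustration and do not come from the slides. The publishing step needs the `drat` package installed.

```r
# Hypothetical names: the data package and the repository URL are placeholders.
data_pkg  <- "RpolyhedraData"
data_repo <- "https://example.github.io/drat"

# Maintainer side (run once per data release): add a built source tarball
# to a local clone of the drat repository; the clone is then committed
# and pushed so that the repository is served over https.
publish_data <- function(tarball, repodir) {
  drat::insertPackage(tarball, repodir = repodir)
}

# User side: is the optional data package already installed?
has_data <- function(pkg = data_pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# User side: install the data package from the drat repository on demand.
install_data <- function(pkg = data_pkg, repo = data_repo) {
  if (!has_data(pkg)) {
    utils::install.packages(pkg, repos = repo, type = "source")
  }
  invisible(has_data(pkg))
}
```

With a layout like this, the package on CRAN would typically list the data package under Suggests and name the repository in the Additional_repositories field of its DESCRIPTION, so the CRAN package stays small and the data is only fetched when a user asks for it.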
-Mark

On Thu, 28 Jun 2018 at 13:56, alejandro baranek <alejandrobara...@gmail.com> wrote:

> Hi Joris:
>
> Thank you for your comments.
> Of course, we are using https for additional downloads.
>
> For the moment we do not need GitHub LFS, but it is an alternative we
> can explore after this short step: our immediate goal is to make the
> package lighter on CRAN. Now it's 35kb, so I think we did well.
>
> We are defining an XSD for exporting polyhedra in XML. After that, it will
> be possible to build an API on top of the polyhedra database and make the
> improvement you suggest. But that will take time: we have no funding yet
> for this project and want to implement some functionality to make it more
> valuable first. It is on our roadmap to make it easy to port to other
> languages. The interface we are using is really simple; it will probably
> be the API interface too.
>
> Best, Ale.
>
> 2018-06-28 5:23 GMT-03:00 Joris Meys <joris.m...@ugent.be>:
>
> > Hi Ale,
> >
> > I'd personally use a more specific solution like GitHub LFS (Large File
> > Storage) for a versioned database. You should also check with CRAN itself,
> > as they keep high standards for everything that's not a standard install.
> > More specifically (from the CRAN policies):
> >
> > Downloads of additional software or data as part of package installation
> > or startup should only use secure download mechanisms (e.g., ‘https’ or
> > ‘ftps’).
> >
> > Personally I would store that information in a public database somewhere
> > with a (minimal) API. This can then be extended without inflating the
> > download and would allow people to install only a subset of what they need.
> > That would also allow people to port your work to other languages by
> > simply writing a wrapper around the DB API. It's not a necessity, but I
> > thought it was worth mentioning as an option.
> >
> > Cheers
> > Joris
> >
> > On Wed, Jun 27, 2018 at 10:22 PM, alejandro baranek <alejandrobara...@gmail.com> wrote:
> >
> >> For now, this is our situation: about 150 polyhedra published, but more
> >> than 800 ready to publish, and because of the package size we cannot
> >> publish all of them.
> >>
> >> It is not a problem on GitHub; it's a problem on CRAN, with build time
> >> (we fixed the testing time with simple sampling techniques). I would like
> >> to hear more from experienced package developers about these issues, but
> >> we seem to have found a solution.
> >>
> >> We decided to create another GitHub repo, RpolyhedraDB. When you install
> >> the package, it downloads the database, at the tag recorded in the
> >> package's data folder, into a directory under the user's home directory.
> >> So the package will be minimal for CRAN, will be RR, and will install the
> >> database on first use (on Travis or other QA/continuous-integration
> >> systems it will of course install it). Using the tags, it will be possible
> >> to offer different database sizes, in case users prefer that.
> >>
> >> Best, Ale.
> >>
> >> 2018-03-29 4:43 GMT-03:00 Berry Boessenkool <berryboessenk...@hotmail.com>:
> >>
> >> > I assume you cannot simply reduce the 150 to a few for demonstration
> >> > purposes?
> >> >
> >> > I have seen people using drat packages on GitHub for data, but GitHub
> >> > has size limits as well...
> >> >
> >> > No expert in this, but maybe this helps a little bit...
> >> >
> >> > Berry
> >> >
> >> > ------------------------------
> >> > *From:* R-package-devel <r-package-devel-boun...@r-project.org> on behalf
> >> > of alejandro baranek <alejandrobara...@gmail.com>
> >> > *Sent:* Tuesday, March 27, 2018 19:26
> >> > *To:* r-package-devel@r-project.org
> >> > *Subject:* [R-pkg-devel] Questions about making a database package
> >> > (Rpolyhedra)
> >> >
> >> > Hello group:
> >> >
> >> > We released Rpolyhedra V0.2 last month. It can scrape more than 800
> >> > polyhedra definitions from public sources. As of V0.2.4 we are publishing
> >> > only 150 because of the time needed to scrape and test all the polyhedra
> >> > and the resulting size of the package. The difference is a setting in
> >> > zzz.R that is very simple to change (anyone who wants to try it can build
> >> > the package themselves).
> >> > The source files of the polyhedra definitions alone are over 12MB (we
> >> > include them in the data folder so that the package is self-sufficient).
> >> >
> >> > But we have doubts about good practices for publishing a database package.
> >> >
> >> > We think the solution is to split the package into an internal
> >> > Rpolyhedra-lib, open source but not on CRAN, and Rpolyhedra with a catalog
> >> > which makes it possible to connect to that repo and download scraped
> >> > polyhedra on demand.
> >> >
> >> > We still have to think through how to connect the two repositories, but
> >> > before touching any code we want to hear from experienced package
> >> > developers and the community in general about how to do this.
> >> > Do you know of any package that behaves similarly? We didn't find one.
> >> >
> >> > Best, Ale.
> >> > --
> >> > alejandro baranek
> >> > @ken4rab <https://twitter.com/ken4rab>
> >> > qbotics <http://qbotics.tumblr.com/> | surferinvaders
> >> > <http://surferinvaders.tumblr.com> | algebraic-soundscapes
> >> > <http://imaginary.org/content/algebraic-soundscapes> | surfer-shuffle
> >> > <http://imaginary.org/program/surfer-shuffle>
> >> >
> >> > [[alternative HTML version deleted]]
> >> >
> >> > ______________________________________________
> >> > R-package-devel@r-project.org mailing list
> >> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >
> > --
> > Joris Meys
> > Statistical consultant
> >
> > Department of Data Analysis and Mathematical Modelling
> > Ghent University
> > Coupure Links 653, B-9000 Gent (Belgium)
> >
> > tel: +32 (0)9 264 61 79
> > -----------
> > Biowiskundedagen 2017-2018
> > http://www.biowiskundedagen.ugent.be/
> >
> > -------------------------------
> > Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php