Maciej:

There are other packages that query the CRAN site (cranlogs, etc.), so it seems that queries and fetches are generally allowed. I can find only a couple of relevant mentions in the CRAN policies:

"Packages which use Internet resources should fail gracefully with an informative message if the resource is not available or has changed (and not give a check warning nor error)."

"Downloads of additional software or data as part of package installation or startup should only use secure download mechanisms (e.g., 'https' or 'ftps'). For downloads of more than a few MB, ensure that a sufficiently large timeout is set."

So it seems that what you are trying to do would be OK with the appropriate cautions in place. Obviously, any test cases will have to run fast, or the package will be rejected for being too slow to check.

This is just my reading of the policies. I have never tried it.

David

-----Original Message-----
From: R-package-devel <r-package-devel-boun...@r-project.org> On Behalf Of Maciej Nasinski
Sent: Friday, July 16, 2021 6:14 AM
To: r-package-devel@r-project.org
Subject: [R-pkg-devel] Scrapping R CRAN website from package

Dear Sir or Madam,

I am creating a new package `pacs` https://github.com/Polkas/pacs, which I want to submit to CRAN shortly. However, I am not sure about the CRAN policy on scraping each package's CRAN page and its archive. More precisely, I am fetching data from https://CRAN.R-project.org/package=%s and https://cran.r-project.org/src/contrib/Archive/%s/ (and downloading old tar.gz files too).

Why I need this: I could read the DESCRIPTION file of any package at any point in time and get a true dependency tree. Moreover, I could get the lifetime of any released package version, where versions that lived fewer than 7 days are marked as risky. I could compare the minimum required dependencies of a package before and after an update. And much more.

I added a few notices inside the package, such as "Please as a courtesy to the R CRAN, don't overload their server by constantly using this function."

Alternatively, if scraping CRAN from my package is a problem, I will try to build a separate database with such data (updated every day). Still, any old tar.gz has to be downloaded.

Maciej Nasinski, University of Warsaw

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel