On Fri, Dec 7, 2018 at 2:48 AM Martin Maechler
<maech...@stat.math.ethz.ch> wrote:
>
> >>>>> Hadley Wickham
> >>>>>     on Thu, 6 Dec 2018 10:22:47 -0600 writes:
>
> > Hi all,
> > I'd love to get some clarification on what the new internet policy
> > means for packages like httr:
>
> >> Packages which use Internet resources should fail gracefully with an
> >> informative message if the resource is not available (and not give a
> >> check warning nor error).
>
> > It's not clear what "internet resource" means here. If it means
> > dataset, then I think the httr tests and examples are ok. If it means
> > any use of the internet, I'm not sure what to do - httr critically
> > depends on internet access, so I can't see any way to make it fail
> > gracefully.
>
> > Hadley
>
> I cannot answer your question, notably as I'm not part of the CRAN
> team, but as R Core developer, I've encountered the problem
> which this policy tries to mitigate many times
> (but I also think we should consider going further than the
> above "policy"):
>
> As R developer, I'd like to see the effect of a change to the
> sources of base R, and so eventually, I may want to run the
> equivalent of 'R CMD check' on all existing CRAN and
> Bioconductor packages. If I have access to a server with many
> cores and very fast hard disks, I can hope to finish running the
> tests within 1--2 days.
> But then I have to deal with the result. The few times I've
> done this, the result has been "a mess" because many, many
> packages nowadays assume in their examples and their regression/unit
> tests that internet access to some resources works, ... which it
> "often" does not, and so download.file(),
> read.table("http://.....") etc result in errors sooner or later.
>
> Because of that some packages fail their checks "randomly" (in
> the sense that internet resources are not available "randomly").
> Ideally we'd find a very good way that these failures are
> communicated back to the person / process running (a version of)
> 'R CMD check', because in the above scenario, I'd like to weed
> out the 300 packages that just failed because of internet
> resource access failures, and only look at the other packages
> that got a change in their 'R CMD check' results.

We now have decent tooling for this in revdepcheck
(https://www.github.com/r-lib/revdepcheck, planning for CRAN
submission next year). After performing all the revdepchecks, you can
run revdep_add_broken() to recheck packages that failed in the
previous round - in my experience testing httr (whose revdeps
obviously use the internet a lot) this resolves most of the randomness
(since it's fairly unlikely to get two random failures in a row).

My main concern about making the checking in examples and tests
stricter is that I think the primary result is that people will simply
do less testing and write fewer realistic examples, which is a net
negative for the community. When you want people to do the right
thing, I think you have to provide a carrot along with the stick.

> The recent introduction in R-devel of classed error conditions
> (in some cases), e.g.,
> https://developer.r-project.org/blosxom.cgi/R-devel/NEWS/2018/10/04#n2018-10-04,
> and the similar and somewhat earlier
> effort of Lionel Henry to use classed error conditions (in
> rlang only, unfortunately, rather than as a patch proposal to R ..)
> may be one step towards a nice solution here.

We'd be happy to propose a patch to base R, but it's not yet clear to
us exactly how things should work, so I think it makes the most sense
to first prototype in a package.

Hadley

--
http://hadley.nz

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
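
For readers unfamiliar with classed error conditions, here is a minimal sketch in base R of how a package might mark an internet failure so that a caller (or a checking process) can tell it apart from other errors. This is only an illustration under stated assumptions: the helper name fetch_or_signal(), the condition class "internet_resource_unavailable", and the example URL are all hypothetical, and nothing here is the mechanism used by R-devel, rlang, or 'R CMD check'.

```r
# Hypothetical helper (not part of any package): try to fetch a resource and,
# on failure, re-signal the error with an extra, assumed class name.
# `fetch` is injectable so the sketch can be run without network access.
fetch_or_signal <- function(url, fetch = function(u) readLines(u, warn = FALSE)) {
  tryCatch(
    fetch(url),
    error = function(e) {
      cnd <- structure(
        list(
          message = paste0("Could not reach <", url, ">: ", conditionMessage(e)),
          call = NULL,
          url = url
        ),
        # Assumed class name; "error" and "condition" keep it a regular error.
        class = c("internet_resource_unavailable", "error", "condition")
      )
      stop(cnd)
    }
  )
}

# Caller side: handle only the internet failure and degrade gracefully.
# The fetch function here simulates a network failure.
result <- tryCatch(
  fetch_or_signal(
    "http://example.invalid/data.csv",
    fetch = function(u) stop("simulated network failure")
  ),
  internet_resource_unavailable = function(cnd) {
    message("Skipping: ", conditionMessage(cnd))
    NULL
  }
)
```

Because the handler is selected by class rather than by matching the message text, any other kind of error would still propagate as usual.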