Thanks Dirk and Ivan, I took a slightly different work-around of forcing the number of threads to 1 when running functions of the test dataset in the package, by adding the following to each user facing function:
``` # Check if running on package test_data, and if so, force data.table to be # single threaded so that we can avoid a NOTE on CRAN submission if (isTRUE(all.equal(x, ukbnmr::test_data))) { registered_threads <- getDTthreads() setDTthreads(1) on.exit({ setDTthreads(registered_threads) }) # re-register so no unintended side effects for users } ``` (i.e. here x is the input argument to the function) It took some trial and error to get to pass the CRAN tests; the number of columns in the input data was also contributing to the problem. Best, Scott On Mon, 21 Aug 2023 at 14:38, Dirk Eddelbuettel <e...@debian.org> wrote: > > On 21 August 2023 at 16:05, Ivan Krylov wrote: > | Dirk is probably right that it's a good idea to have OMP_THREAD_LIMIT=2 > | set on the CRAN check machine. Either that, or place the responsibility > | on data.table for setting the right number of threads by default. But > | that's a policy question: should a CRAN package start no more than two > | threads/child processes even if it doesn't know it's running in an > | environment where the CPU time / elapsed time limit is two? > > Methinks that given this language in the CRAN Repository Policy > > If running a package uses multiple threads/cores it must never use more > than two simultaneously: the check farm is a shared resource and will > typically be running many checks simultaneously. > > it would indeed be nice if this variable, and/or equivalent ones, were set. > > As I mentioned before, I had long added a similar throttle (not for > data.table) in a package I look after (for work, even). So a similar > throttler with optionality is below. I'll add this to my `dang` package > collecting various functions. > > A usage example follows. It does nothing by default, ensuring 'full power' > but reflects the minimum of two possible options, or an explicit count: > > > dang::limitDataTableCores(verbose=TRUE) > Limiting data.table to '12'. > > Sys.setenv("OMP_THREAD_LIMIT"=3); > dang::limitDataTableCores(verbose=TRUE) > Limiting data.table to '3'. > > options(Ncpus=2); dang::limitDataTableCores(verbose=TRUE) > Limiting data.table to '2'. > > dang::limitDataTableCores(1, verbose=TRUE) > Limiting data.table to '1'. > > > > That makes it, in my eyes, preferable to any unconditional 'always pick 1 > thread'. > > Dirk > > > ##' Set threads for data.table respecting possible local settings > ##' > ##' This function set the number of threads \pkg{data.table} will use > ##' while reflecting two possible machine-specific settings from the > ##' environment variable \sQuote{OMP_THREAD_LIMIT} as well as the R > ##' option \sQuote{Ncpus} (uses e.g. for parallel builds). > ##' @title Set data.table threads respecting default settingss > ##' @param ncores A numeric or character variable with the desired > ##' count of threads to use > ##' @param verbose A logical value with a default of \sQuote{FALSE} to > ##' operate more verbosely > ##' @return The return value of the \pkg{data.table} function > ##' \code{setDTthreads} which is called as a side-effect. > ##' @author Dirk Eddelbuettel > ##' @export > limitDataTableCores <- function(ncores, verbose = FALSE) { > if (missing(ncores)) { > ## start with a simple fallback: 'Ncpus' (if set) or else 2 > ncores <- getOption("Ncpus", 2L) > ## also consider OMP_THREAD_LIMIT (cf Writing R Extensions), gets > NA if envvar unset > ompcores <- as.integer(Sys.getenv("OMP_THREAD_LIMIT")) > ## and then keep the smaller > ncores <- min(na.omit(c(ncores, ompcores))) > } > stopifnot("Package 'data.table' must be installed." = > requireNamespace("data.table", quietly=TRUE)) > stopifnot("Argument 'ncores' must be numeric or character" = > is.numeric(ncores) || is.character(ncores)) > if (verbose) message("Limiting data.table to '", ncores, "'.") > data.table::setDTthreads(ncores) > } > > | > | -- > | Best regards, > | Ivan > | > | ______________________________________________ > | R-package-devel@r-project.org mailing list > | https://stat.ethz.ch/mailman/listinfo/r-package-devel > > -- > dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org > [[alternative HTML version deleted]] ______________________________________________ R-package-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel