Hello. As already pointed out, the current R implementation treats any non-empty value of _R_CHECK_LIMIT_CORES_ other than "false" as a true value, e.g. "TRUE", "true", "T", "1", but also "donald duck". Using '--as-cran' sets _R_CHECK_LIMIT_CORES_="TRUE" if it is unset; if it is already set, it is left untouched. So it could be that a CRAN check server already uses, say, _R_CHECK_LIMIT_CORES_="true". We cannot make assumptions about that.
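To make that concrete, here is a minimal sketch of a helper that mirrors the test used in parallel:::.check_ncores (the helper name cores_are_limited is my own, not from the thread):

```r
# Sketch: mirrors the test in parallel:::.check_ncores.
# Any non-empty value other than "false" (case-insensitively)
# activates the limit -- including nonsense values.
cores_are_limited <- function() {
  chk <- tolower(Sys.getenv("_R_CHECK_LIMIT_CORES_", ""))
  nzchar(chk) && chk != "false"
}
```

With this helper, "TRUE", "true", "T", "1", and even "donald duck" all report a limit, while "" and "false"/"FALSE" do not.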
To make your life, and end-users' lives, easier, I suggest just using num_workers <- 2L, without conditioning on whether you are running on CRAN or not. Why? There are many problems with using parallel::detectCores().

First of all, it can return NA_integer_ on some systems, so you cannot assume it gives a valid value (== error). It can also return 1L, in which case your 'num_workers - 1' gives zero workers (== error). You need to account for both cases if you rely on detectCores().

Second, detectCores() reports the number of CPU cores on the machine's hardware. It is getting more and more common to run in cgroups-constrained environments where your R process only gets access to a fraction of those cores. Such constraints are in place in many shared multi-user HPC environments, and sometimes when using Linux containers (e.g. Docker, Apptainer, and Podman). A notable example is RStudio Cloud. So, if you use detectCores() on those systems, you will actually over-parallelize, which slows things down and risks running out of memory. For example, you might launch 64 parallel workers when you only have access to four CPU cores; each core is then clogged up by 16 workers.

Third, if you default to detectCores() and a user runs your code on a machine shared by many users, the other users will not be happy. Note that the user will often not even know they are overusing the machine. So, it is a lose-lose for everyone.

Fourth, detectCores() returns *all* CPU cores on the current machine. These days we have machines with 128, 196, and more cores. Are you sure your software will actually run faster when using that many cores? The benefit from parallelization tends to decrease as you add more workers, until there is no longer a speed improvement. If you keep adding parallel workers beyond that point, you will see a negative effect, i.e. you are penalized for parallelizing too much.
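If you nevertheless derive a worker count from detectCores(), you at least have to guard against the NA_integer_ and 1L cases above. A hedged sketch (safe_workers is a made-up name, and the cap of 2L is just the conservative CRAN-friendly default):

```r
# Sketch: clamp parallel::detectCores() to a safe worker count.
# detectCores() may return NA_integer_ or 1L, so never trust it blindly.
safe_workers <- function(cap = 2L) {
  n <- parallel::detectCores()
  if (is.na(n)) n <- 1L   # unknown hardware: assume a single core
  n <- n - 1L             # leave one core for the rest of the system
  max(1L, min(cap, n))    # at least 1 worker, at most 'cap'
}
```

Note also that parallelly::availableCores() is designed to respect cgroups limits and check-time settings such as _R_CHECK_LIMIT_CORES_, which sidesteps several of the issues above.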
So, be aware that when you test on 16 or 24 cores and things run really fast, that might not be the experience of other users, or of users in the future (who will have access to even more CPU cores).

So, yes, I suggest not using num_workers <- detectCores(). Pick a fixed number instead; the CRAN policy suggests using two. You can let the user control how many workers they want to use. As a developer, it is really, really hard (read: impossible) to know how many they want to use.

Cheers,

Henrik

PS. Note that detectCores() returns a single integer value (possibly NA_integer_). Because of this, there is no need to subset with num_workers[1]. I have seen this used in code; I am not sure where it comes from, but it looks like cut'n'paste behavior.

On Wed, Nov 16, 2022 at 6:38 AM Riko Kelter <riko.kel...@uni-siegen.de> wrote:
>
> Hi Ivan,
>
> thanks for the info, I changed the check as you pointed out and it
> worked. R CMD build and R CMD check --as-cran run without errors or
> warnings on Linux + macOS. However, I uploaded the package again to the
> WINBUILDER service and obtained the following weird error:
>
>   * checking re-building of vignette outputs ... ERROR
>   Check process probably crashed or hung up for 20 minutes ... killed
>   Most likely this happened in the example checks (?),
>   if not, ignore the following last lines of example output:
>
>   ======== End of example output (where/before crash/hang up occured ?) ========
>
> Strangely, there are no examples included in any .Rd file. Also, I
> checked whether a piece of code spawns new clusters. However, the
> critical lines are inside a function which is repeatedly called in the
> vignettes. The parallelized part is copied below. After the code is
> executed, the cluster is stopped. I use registerDoSNOW(cl) because
> otherwise my progress bar does not work.
>
> Code:
>
> ############################### CHECK CORES
>
> chk <- tolower(Sys.getenv("_R_CHECK_LIMIT_CORES_", ""))
> if (nzchar(chk) && (chk != "false")) {  # then limit the workers
>   num_workers <- 2L
> } else {
>   # use all cores
>   num_workers <- parallel::detectCores()
> }
>
> chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_", "")
>
> cl <- parallel::makeCluster(num_workers[1] - 1)  # not to overload your computer
> # doParallel::registerDoParallel(cl)
> doSNOW::registerDoSNOW(cl)
>
> ############################### SET UP PROGRESS BAR
>
> pb <- progress_bar$new(
>   format = "Iteration = :letter [:bar] :elapsed | expected time till finish: :eta",
>   total = nsim,  # 100
>   width = 120)
>
> progress_letter <- seq(1, nsim)  # token reported in progress bar
>
> # allowing progress bar to be used in foreach -----------------------------
> progress <- function(n) {
>   pb$tick(tokens = list(letter = progress_letter[n]))
> }
>
> opts <- list(progress = progress)
>
> ############################### MAIN SIMULATION
>
> if (method == "PP") {
>   finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
>       .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
>     tempMatrix = singleTrial_PP(s = s, n = nInit,
>         responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
>         batchsize = batchsize, a0 = a0, b0 = b0)
>     tempMatrix  # Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
>   }
> }
>
> if (method == "PPe") {
>   refFunc = refFunc
>   nu = nu
>   shape1 = shape1
>   shape2 = shape2
>   if (refFunc == "flat") {
>     finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
>         .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
>       tempMatrix = singleTrial_PPe(s = s, n = nInit,
>           responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
>           batchsize = batchsize, a0 = a0, b0 = b0, refFunc = "flat")
>       tempMatrix  # Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
>     }
>   }
>   if (refFunc == "beta") {
>     finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
>         .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
>       tempMatrix = singleTrial_PPe(s = s, n = nInit,
>           responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
>           batchsize = batchsize, a0 = a0, b0 = b0, refFunc = "beta",
>           shape1 = shape1, shape2 = shape2)
>       tempMatrix  # Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
>     }
>   }
>   if (refFunc == "binaryStep") {
>     finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
>         .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
>       tempMatrix = singleTrial_PPe(s = s, n = nInit,
>           responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
>           batchsize = batchsize, a0 = a0, b0 = b0, refFunc = "binaryStep",
>           shape1 = shape1, shape2 = shape2, truncation = truncation)
>       tempMatrix  # Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
>     }
>   }
>   if (refFunc == "relu") {
>     finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
>         .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
>       tempMatrix = singleTrial_PPe(s = s, n = nInit,
>           responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
>           batchsize = batchsize, a0 = a0, b0 = b0, refFunc = "relu",
>           shape1 = shape1, shape2 = shape2, truncation = truncation)
>       tempMatrix  # Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
>     }
>   }
>   if (refFunc == "palu") {
>     finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
>         .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
>       tempMatrix = singleTrial_PPe(s = s, n = nInit,
>           responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
>           batchsize = batchsize, a0 = a0, b0 = b0, refFunc = "palu",
>           shape1 = shape1, shape2 = shape2, truncation = truncation)
>       tempMatrix  # Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
>     }
>   }
>   if (refFunc == "lolu") {
>     finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
>         .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
>       tempMatrix = singleTrial_PPe(s = s, n = nInit,
>           responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
>           batchsize = batchsize, a0 = a0, b0 = b0, refFunc = "lolu",
>           shape1 = shape1, shape2 = shape2, truncation = truncation)
>       tempMatrix  # Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
>     }
>   }
> }
>
> ############################### STOP CLUSTER
>
> parallel::stopCluster(cl)  # stop cluster
>
>
> Kind regards,
>
> Riko
>
>
> On 16.11.22 at 08:29, Ivan Krylov wrote:
> > On Wed, 16 Nov 2022 07:29:25 +0100
> > Riko Kelter <riko.kel...@uni-siegen.de> wrote:
> >
> >> if (nzchar(chk) && chk == "TRUE") {
> >>   # use 2 cores in CRAN/Travis/AppVeyor
> >>   num_workers <- 2L
> >> }
> >
> > The check in parallel:::.check_ncores is a bit different:
> >
> >   chk <- tolower(Sys.getenv("_R_CHECK_LIMIT_CORES_", ""))
> >   if (nzchar(chk) && (chk != "false"))  # then limit the workers
> >
> > Unless you actually set _R_CHECK_LIMIT_CORES_=FALSE on your machine
> > when running the checks, I would perform a more pessimistic check of
> > nzchar(chk) (without additionally checking whether it's TRUE or not
> > FALSE), though copy-pasting the check from parallel:::.check_ncores
> > should also work.
> >
> > Can we see the rest of the vignette? Perhaps the problem is not with
> > the check. For example, a piece of code might be implicitly spawning a
> > new cluster, defaulting to all of the cores instead of num_workers.
> >
> >> [[alternative HTML version deleted]]
> >
> > Unfortunately, the plain text version of your message prepared by your
> > mailer has all the code samples mangled:
> > https://stat.ethz.ch/pipermail/r-package-devel/2022q4/008647.html
> >
> > Please compose your messages to R mailing lists in plain text.
______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel