HTCondor has been around for a long time (originally as "Condor", started in 1988!)

https://github.com/htcondor/htcondor
https://htcondor.org/
https://en.wikipedia.org/wiki/HTCondor

I have no idea how difficult it is to set up. The developers do offer contract support: <https://htcondor.org/uw-support/>
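For what it's worth, each job is described by a small submit file; a hypothetical one running an R script ten times might look like this (the paths and resource requests are invented for illustration):

    universe       = vanilla
    executable     = /usr/bin/Rscript
    arguments      = analysis.R $(Process)
    request_cpus   = 1
    request_memory = 2GB
    output         = out.$(Process).txt
    error          = err.$(Process).txt
    log            = jobs.log
    queue 10

The harder part, as I understand it, is configuring the central manager and the execute nodes so that desktops only accept jobs when they are idle.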

On 2025-04-30 10:12 a.m., Ivan Krylov via R-help wrote:
Dear Ivo Welch,

Sorry for not answering the question you asked (I don't know of such a
vendor), but here are a few comments that may help:

On Tue, 29 Apr 2025 17:20:25 -0700
ivo welch <ivo.we...@ucla.edu> wrote:

> These computers are mostly idle overnight.  We have no interest in
> bitmining, and SETI@home doesn't seem very active any more, either.
> Alas, it's 2025 now, so maybe there is something better we could do
> with all this idle compute power when it comes to our own statistical
> analyses. Maybe we could cluster them overnight.

The state of the art in volunteer computing is still BOINC, the same
system that powers most of the "@home" projects. It lets the user
control when to run the jobs and when to stop (e.g. run jobs overnight
but only if the system is not under load from something else) and doesn't
require the job submitter to be able to log in to the worker nodes or
even rely on the nodes being able to accept incoming connections.

It's possible to run a BOINC server yourself [1], although the server
side will take some work to set up, and the jobs need to be specially
packaged. In theory, one could package R as a BOINC app and arrange for
it to run jobs serialized into *.rds files, but it's a lot of
infrastructure work to put all the moving parts in the right places
(package versions alone are a serious problem with no easy solution).
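To give a flavour of the serialization side, here is a minimal sketch; the file names, the my_data object, and the split between submitter and worker are my own assumptions, not part of any existing BOINC app:

    ## Submitter side: bundle a function and its arguments into a job file.
    ## my_data is a hypothetical data frame with columns y and x.
    job <- list(fun = function(d) coef(lm(y ~ x, data = d)),
                args = list(d = my_data))
    saveRDS(job, "job0001.rds")

    ## Worker side: a wrapper script the BOINC app would run via Rscript.
    job <- readRDS("job0001.rds")
    result <- do.call(job$fun, job$args)
    saveRDS(result, "result0001.rds")

The worker would then upload result0001.rds back to the project server, and the controller would readRDS() it to collect the answer.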

> Ideally, we would then have a frontend R (controller) that could run
> `mclapply` statements on this Franken-computer, and be smart enough
> about how to distribute the load.

One problem with parLapply() is that it expects the cluster object to
be a list containing a fixed number of node objects. I've dealt
with a similar problem: I needed to distribute jobs between my
colleagues' workstations when they could spare some CPU power, letting
computers leave and rejoin the cluster at will. In the end, I had to
pretend that my 'parallel' cluster always contained an excessive number
of nodes (128) and distribute a larger number of smaller sub-tasks
dynamically.
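The dynamic part can be done with clusterApplyLB(), which hands each worker one task at a time as it becomes free. A minimal sketch, using local workers as a stand-in for remote machines:

    library(parallel)

    ## Stand-in for remote workers; real hostnames would go here.
    cl <- makePSOCKcluster(rep("localhost", 4))

    ## Split the work into many more sub-tasks than workers, so a slow
    ## node only delays one small chunk, not a whole stripe of the work.
    chunks <- split(1:1000, cut(1:1000, 50))

    ## The load-balancing scheduler hands out one chunk at a time.
    res <- clusterApplyLB(cl, chunks,
                          function(idx) vapply(idx, function(i) i^2,
                                               numeric(1)))
    stopCluster(cl)

Unlike the static split used by parLapply(), slow nodes don't hold up an equal share of the work.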

A general-purpose interface for a volunteer cluster will probably not
work as a drop-in replacement for mclapply(). You might be able to
achieve part of what you want using 'mirai', telling every worker node
to connect to the client node for tasks. BOINC can set memory and CPU
core limits, but it might be unable to save you from inefficient job
plans. See 'future.batchtools' for an example of an R interface to
cluster job-submission systems.
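A minimal mirai sketch of that arrangement (the address and port are made up):

    library(mirai)

    ## On the controller: listen for incoming workers.
    daemons(url = "tcp://10.0.0.1:5555")

    ## On each volunteer machine, e.g. from a cron job at night:
    ##   Rscript -e 'mirai::daemon("tcp://10.0.0.1:5555")'

    ## Back on the controller: tasks queue up and run as workers connect.
    m <- mirai(sum(rnorm(n)), n = 1e6)
    call_mirai(m)  # block until some worker returns the result
    m$data

Workers can come and go; as long as one eventually connects, the queued tasks get done.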


--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
E-mail is sent at my convenience; I don't expect replies outside of working hours.
