Very interesting problem! Have you posted to Hacker News? The only such system I have used is Borg -- https://research.google/pubs/large-scale-cluster-management-at-google-with-borg/
On Wed, Apr 30, 2025 at 4:48 AM ivo welch <ivo.we...@ucla.edu> wrote:

> We have about 50 different Mac computers, all ARM, distributed across our
> offices. They range from a few M1s with 8 GB all the way to M4s with 64
> GB. (The M4 mini for $600 is an amazing compute engine!)
>
> These computers are mostly idle overnight. We have no interest in
> bitmining, and SETI@home doesn't seem very active any more, either.
> Alas, it's 2025 now, so maybe there is something better we could do with
> all this idle compute power when it comes to our own statistical analyses.
> Maybe we could cluster them overnight.
>
> I could likely convince my colleagues to run a cron job (or systemctl --
> well, launchctl on macOS) that starts listening at 7pm and ends around
> 7am, sharing say 80% of their memory and CPU, plus say 32 GB of SSD. I
> won't be able to actively administer their computers, so the client has to
> be easy for them to install, turn on, and turn off; it must accept
> programs and inputs, cache some of the data, and send back output. (The
> sharing would be only on the local network, not the entire internet,
> which would make them feel more comfortable with it.)
>
> Ideally, we would then have an R frontend (controller) that could run
> `mclapply` statements on this Franken-computer and be smart enough about
> how to distribute the load. For example, an M4 is about 1.5x as fast as an
> M1 on a single CPU, and it's easy to count up CPUs. If my job is estimated
> to need 4 GB per core, presumably I wouldn't want to start 50 processes on
> a computer that has 10 cores and 8 GB. If the frontend estimates that the
> upload and download will take longer than the savings, it should just
> forget about distributing it. And so on. Reasonable rules, perhaps
> indicated by the user and/or assessable from a few local mclapply runs
> first.
> It's almost like profiling the job for a few minutes or a few iterations
> locally, and then deciding whether to send off parts of it to all the
> other compute nodes on this Franken-net.
>
> I am not holding my breath on ChatGPT and artificial intelligence, of
> course. However, this seems like a hard but feasible engineering problem.
> Is there a vendor who sells a plug-and-play solution to this problem? I am
> guessing we are not unusual in a setup like this, though an upper price
> bound on the software here is of course just the cost of buying a giant
> homogeneous computer or using Amazon resources.
>
> Pointers appreciated.
>
> /iaw
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> https://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
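For what it's worth, base R's `parallel` package can already approximate part of this with PSOCK workers over the local network. A minimal sketch -- the hostnames, core counts, and RAM figures below are hypothetical placeholders, and remote PSOCK workers assume passwordless SSH plus a matching R on each host:

```r
library(parallel)

## Hypothetical host inventory (name, physical cores, RAM in GB).
hosts <- data.frame(
  name  = c("m1-office-a", "m4-office-b"),  # placeholder hostnames
  cores = c(8, 10),
  ram   = c(8, 64)
)

## Cap workers per host so each worker gets at least gb_per_task GB:
## e.g. an 8 GB / 8-core M1 gets min(8, floor(8/4)) = 2 workers.
gb_per_task <- 4
n_workers <- pmin(hosts$cores, floor(hosts$ram / gb_per_task))

## One PSOCK worker per allowed slot; connects via ssh by default.
cl <- makePSOCKcluster(rep(hosts$name, n_workers))

## Distribute work much like mclapply, but across machines;
## the LB variant hands out tasks dynamically, so faster M4
## workers naturally take on more iterations than M1 workers.
results <- parLapplyLB(cl, 1:100, function(i) i^2)

stopCluster(cl)
```

This doesn't do the automatic profiling or transfer-cost estimation you describe -- you would still hand-tune `gb_per_task` and the host list -- but it covers the "cluster them overnight and run mclapply-style calls" core without any third-party software.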