Hi Mike,
If you run the Slurm daemons in a container, but the Slurm commands are run
from the host, you need to make sure that the Slurm commands on the host and
the Slurm daemons in the container are running similar versions of Slurm.
Otherwise, the commands may not be able to communicate with the daemons if
there's a protocol change. We were running the Slurm daemons in a container
and ran into a problem where the commands could no longer communicate with the
daemons when the version of Slurm on the host was updated, but the container
was still running an older version of the daemons.
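A quick way to catch this kind of mismatch is to compare the version the
host commands report with the version the containerized daemons report. The
following is only a minimal sketch, not something we ran in production. It
assumes the version output looks like "slurm 22.05.7" (as printed by
`sinfo --version`), and it treats "same major release" (the first two
components, e.g. 22.05) as the check, which is a deliberately strict
assumption. The function names and the container name in the comment are
hypothetical.

```python
import re
import subprocess


def parse_slurm_version(text):
    """Extract (major, minor, micro) from output such as 'slurm 22.05.7'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def same_release(host_version, container_version):
    """Return True when both versions share the first two components."""
    return host_version[:2] == container_version[:2]


def version_from_command(command):
    """Run a command that prints a Slurm version and parse its stdout."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return parse_slurm_version(result.stdout)


# Example use on a host with Slurm and Docker installed. The container name
# "slurmctld" is an assumption; substitute whatever your setup uses:
#   host = version_from_command(["sinfo", "--version"])
#   ctl = version_from_command(["docker", "exec", "slurmctld", "slurmctld", "-V"])
#   if not same_release(host, ctl):
#       print(f"version mismatch: host {host}, container {ctl}")
```

Running a check like this before and after host package updates would flag
the situation described above before users notice failing commands.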
Rigoberto
On Wednesday, February 15, 2023 at 01:52:33 PM EST, Hanby, Mike
<mha...@uab.edu> wrote:
Howdy,
Just wondering if any sites are running containerized slurmctld and slurmdbd
in production?
We are in the process of planning a migration from a single host running
slurmctld, slurmdbd, and MySQL (and other HPC services) to separate OpenStack
VMs. Our site averages fewer than 1,000 running/pending jobs at any given
time. Like many HPC sites, our jobs are a mix of long-running, large arrays,
very short…
I ran across this GitHub project, “Slurm Docker Cluster”
(https://github.com/giovtorres/slurm-docker-cluster), and it got me thinking
that this method might be great for simpler upgrades, ease of reproducing the
cluster in development, etc…
How about it, anyone running containerized Slurm server processes in production?
Thanks, Mike