Our current cluster is running CentOS 7.9, and we anticipate setting up a new cluster by the end of the year. It will most likely run one of the CentOS 8.x alternatives (Rocky/Alma/???) with the latest version of Slurm.
Our team is investigating whether it would be appropriate to run the Slurm server for this upcoming cluster as a container instance. By "server" we mean slurmctld and slurmdbd, with the database as a separate system/container.

Is anyone here doing this? If so:

- Are there any advantages to doing so?
- Are there any drawbacks?
- What containerization solution/platform are you using? We would most likely be using Kubernetes.
- What sort of challenges have you experienced, and how did you solve them?
- Are there any best practices that you would recommend?
- Are there any other HPC institutions you know are doing this that we should reach out to?

Lee Reynolds
Senior RC Architect
RTO Research Computing
Arizona State University
Mail Code: 6011
Tempe, AZ 85287-5206
p: 480-965-9460
email: lee.reyno...@asu.edu
web: https://cores.research.asu.edu/research-computing/about-rc