[slurm-users] Seeking Commercial SLURM Subscription Provider

2024-08-12 Thread John Joseph via slurm-users
Dear All, Good morning. We successfully implemented a 4-node SLURM cluster with shared storage using GlusterFS and were able to run COMSOL programs on it. After this learning experience, we've determined that it would be beneficial to switch to a commercial SLURM subscription for better suppo

[slurm-users] Re: Annoying canonical question about converting SLURM_JOB_NODELIST to a host list for mpirun

2024-08-12 Thread Paul Edmon via slurm-users
Ah, that's even more fun. I know with Singularity you can launch MPI applications by calling MPI outside of the container and then having it link to the internal version: https://docs.sylabs.io/guides/3.3/user-guide/mpi.html  Not sure about Docker though. -Paul Edmon- On 8/12/2024 10:30 AM,

[slurm-users] Re: Annoying canonical question about converting SLURM_JOB_NODELIST to a host list for mpirun

2024-08-12 Thread Jeffrey Layton via slurm-users
It's in a container. Specifically horovod/horovod on Docker Hub. I'm going into the container to investigate now (I think I have a link to the Dockerfile as well). Thanks! Jeff On Mon, Aug 12, 2024 at 10:01 AM Paul Edmon wrote: > Certainly a strange setup. I would probably talk with who e

[slurm-users] Re: Annoying canonical question about converting SLURM_JOB_NODELIST to a host list for mpirun

2024-08-12 Thread Paul Edmon via slurm-users
Certainly a strange setup. I would probably talk with whoever is providing MPI for you and ask them to build it against Slurm properly. To get correct process binding, you definitely want it integrated properly with Slurm, either via PMI2 or PMIx. If you just use the bare hos

[slurm-users] Re: Annoying canonical question about converting SLURM_JOB_NODELIST to a host list for mpirun

2024-08-12 Thread Jeffrey Layton via slurm-users
Paul, I tend not to rely on the MPI being built with Slurm :) I find that the systems I use haven't done that. :( I'm not exactly sure why, but that is the way it is :) Up to now, using scontrol has always worked for me. However, a new system is not cooperating (it is running on the submittal h

[slurm-users] Re: Annoying canonical question about converting SLURM_JOB_NODELIST to a host list for mpirun

2024-08-12 Thread Paul Edmon via slurm-users
Normally MPI will just pick up the host list from Slurm itself. You just need to build MPI against Slurm and it will just grab it. Typically this is transparent to the user. Normally you shouldn't need to pass a host list at all. See: https://slurm.schedmd.com/mpi_guide.html The canonical way

[slurm-users] Re: Annoying canonical question about converting SLURM_JOB_NODELIST to a host list for mpirun

2024-08-12 Thread Jeffrey Layton via slurm-users
Thanks! I admit I'm not that experienced in Bash. I will give this a whirl as a test. In the meantime, let me ask, what is the "canonical" way to create the host list? It would be nice to have this in the Slurm FAQ somewhere. Thanks! Jeff On Fri, Aug 9, 2024 at 1:32 PM Hermann Schwärzler via slu
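
For readers following the thread's question, here is a minimal sketch of what expanding a compact node list into individual host names involves. It is not taken from any of the messages above. The function names expand_nodelist and split_top_level are hypothetical, and the parser assumes a simplified syntax: comma-separated entries with at most one bracket group each, such as "node[01-03,07]". On a real cluster, the scontrol utility mentioned in the thread does this for you (scontrol show hostnames "$SLURM_JOB_NODELIST"), and it handles cases this sketch does not.

```python
import re


def split_top_level(nodelist):
    """Split on commas that are not inside square brackets."""
    parts, depth, current = [], 0, []
    for ch in nodelist:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current))
    return parts


def expand_nodelist(nodelist):
    """Expand e.g. 'node[01-03,07],login1' into a list of host names.

    Assumption: each entry has at most one [..] group holding
    comma-separated numbers or start-end ranges.
    """
    hosts = []
    for part in split_top_level(nodelist):
        match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", part)
        if match is None:
            # Plain host name with no bracket group.
            hosts.append(part)
            continue
        prefix, ranges, suffix = match.groups()
        for rng in ranges.split(","):
            start, _, end = rng.partition("-")
            end = end or start
            width = len(start)  # keep zero padding, e.g. 01 -> 01
            for n in range(int(start), int(end) + 1):
                hosts.append(f"{prefix}{n:0{width}d}{suffix}")
    return hosts


if __name__ == "__main__":
    # Comma-joined form, as a launcher's host option might expect.
    print(",".join(expand_nodelist("node[01-03,07],login1")))
```

The comma-joined output is one plausible shape for a host list argument; whether a given mpirun accepts it, and under which option, depends on the MPI implementation in use.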