Re: [slurm-users] Using "srun" on compute nodes -- Ray cluster

2022-07-15 Thread Reed Dier
I have some users who are using Ray on Slurm. I will preface this by saying we are new Slurm users, so we may not be doing everything exactly correctly. The only issue we have run into so far was somewhat Ray-specific. Specifically, and pardon my lack of specificity,

Re: [slurm-users] Using "srun" on compute nodes -- Ray cluster

2022-07-15 Thread Ryan Novosielski
Are you talking about a script that is run via sbatch containing srun command lines? If so, there are a lot of reasons to do that. One is better instrumentation, as I understand it, but also srun --mpi is a way to eliminate mpiexec/mpirun/etc., and is what we recommend at our site instead (usin

[slurm-users] Using "srun" on compute nodes -- Ray cluster

2022-07-15 Thread Kamil Wilczek
Dear Slurm Users, one of my cluster users would like to run a Ray cluster on Slurm. I noticed that the batch script example requires running the "srun" command on a compute node that is already allocated: https://docs.ray.io/en/latest/cluster/examples/slurm-template.html#slurm-template This is
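
For readers unfamiliar with the pattern under discussion, here is a rough sketch of the kind of "srun" job steps such a batch script launches from inside an existing allocation: one step pinned to the first node to start the Ray head, and one step per remaining node to start a worker. It only builds and prints the command lines, it does not run them. The function name, the node names, and the port 6379 are my own assumptions for illustration and are not taken from the linked template; in a real job the host list would come from something like scontrol show hostnames "$SLURM_JOB_NODELIST".

```python
import shlex


def ray_srun_commands(nodes, port=6379):
    """Build srun command lines that would start Ray inside an allocation.

    `nodes` is a list of hostnames already allocated to the job
    (hypothetical values here). Nothing is executed; the commands are
    returned as argument lists.
    """
    if not nodes:
        raise ValueError("need at least one node")
    head, workers = nodes[0], nodes[1:]

    # One job step pinned to the first node starts the Ray head process.
    cmds = [[
        "srun", "--nodes=1", "--ntasks=1", "-w", head,
        "ray", "start", "--head", f"--port={port}", "--block",
    ]]

    # One job step per remaining node starts a worker that joins the head.
    for worker in workers:
        cmds.append([
            "srun", "--nodes=1", "--ntasks=1", "-w", worker,
            "ray", "start", f"--address={head}:{port}", "--block",
        ])
    return cmds


if __name__ == "__main__":
    # Hypothetical three-node allocation.
    for cmd in ray_srun_commands(["node01", "node02", "node03"]):
        print(shlex.join(cmd))
```

In an actual sbatch script each of these steps would typically be backgrounded so the script can go on to launch the user's workload; whether that is the right approach for a given site is exactly the question this thread raises.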