Re: [slurm-users] srun : Communication connection failure

2022-01-25 Thread Ryan Novosielski
I’m coming to this question late, and this is not the answer to your problem (well, maybe tangentially), but it may help someone else: my recollection is that the compute node that gets assigned the job must be able to contact the node you’re starting the interactive job from (so bg-slurmb-login
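
A rough way to test that reachability is a plain TCP connect, run on the compute node against the submitting host. This is a minimal sketch, assuming Python 3 is available there. `can_connect`, the host name, and the port in the usage comment are hypothetical placeholders, not values from this thread. The port would have to be one that srun is actually listening on; that range can be pinned with `SrunPortRange` in slurm.conf, otherwise the ports are ephemeral.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the name and tries each returned address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable hosts and DNS failures.
        return False


# Hypothetical usage on the compute node (placeholder host and port):
#   can_connect("login-node.example", 60001)
```

A False result only says the connection could not be opened from that node; it does not distinguish a firewall rule from a routing or name-resolution problem.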

Re: [slurm-users] srun : Communication connection failure

2022-01-20 Thread Durai Arasan
Hello Slurm users,

I forgot to mention that an identical interactive job works successfully on the gpu partitions (in the same cluster). So this is really puzzling.

Best,
Durai Arasan
MPI Tuebingen

On Thu, Jan 20, 2022 at 3:40 PM Durai Arasan wrote:
> Hello Slurm users,
>
> We are suddenly enc

[slurm-users] srun : Communication connection failure

2022-01-20 Thread Durai Arasan
Hello Slurm users,

We are suddenly encountering strange errors while trying to launch interactive jobs on our cpu partitions. Have you encountered this problem before? Kindly let us know.

[darasan84@bg-slurmb-login1 ~]$ srun --job-name "admin_test231" --ntasks=1 --nodes=1 --cpus-per-task=1 --part