[slurm-users] srun : Communication connection failure

2022-01-20 Thread Durai Arasan
Hello Slurm users,

We are suddenly encountering strange errors while trying to launch interactive jobs on our cpu partitions. Have you encountered this problem before? Kindly let us know.

[darasan84@bg-slurmb-login1 ~]$ srun --job-name "admin_test231" --ntasks=1 --nodes=1 --cpus-per-task=1 --part
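The digest preview cuts the command off at the partition flag. For reference, a complete interactive launch along these lines would typically end with a partition name and a pseudo-terminal request; the partition name below (cpu-short) is purely hypothetical, since the real one is truncated out of the archive:

[darasan84@bg-slurmb-login1 ~]$ srun --job-name "admin_test231" --ntasks=1 --nodes=1 --cpus-per-task=1 --partition=cpu-short --pty bash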

Re: [slurm-users] srun : Communication connection failure

2022-01-20 Thread Durai Arasan
Hello Slurm users,

I forgot to mention that an identical interactive job works successfully on the gpu partitions (in the same cluster). So this is really puzzling.

Best,
Durai Arasan
MPI Tuebingen

On Thu, Jan 20, 2022 at 3:40 PM Durai Arasan wrote:
> Hello Slurm users,
>
> We are suddenly enc

Re: [slurm-users] [External] Re: srun : Communication connection failure

2022-01-20 Thread Michael Robbert
It looks like it could be some kind of network problem, but it could also be DNS. Can you ping and do DNS resolution for the host involved? What does slurmctld.log say? How about slurmd.log on the node in question?

Mike
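A minimal sketch of those checks, run from the login node; the compute node name (cn-001) and the log locations are assumptions, since neither appears in the thread:

[darasan84@bg-slurmb-login1 ~]$ ping -c 3 cn-001                          # basic reachability
[darasan84@bg-slurmb-login1 ~]$ host cn-001                               # forward DNS lookup
[darasan84@bg-slurmb-login1 ~]$ host 10.0.0.17                            # reverse lookup of the address ping reported (hypothetical IP)
[darasan84@bg-slurmb-login1 ~]$ ssh cn-001 sudo tail /var/log/slurm/slurmd.log    # slurmd log on the node in question
(and, on the controller host) $ sudo tail /var/log/slurm/slurmctld.log

Since srun itself listens on the submitting host for connections coming back from the compute nodes, it is also worth checking that no firewall blocks those return connections from the cpu nodes to the login node.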

[slurm-users] memory per node default

2022-01-20 Thread Hoot Thompson
How do you change the default memory per node from the current 1MB to something much higher? Thanks in advance.

ubuntu@node:/shared$ sinfo -o "%20N%10c%10m%25f%10G "
NODELIST            CPUS      MEMORY    AVAIL_FEATURES           GRES
hpc-demand-dy-c5n18x36        1         dynamic,c5n.18xlarge,c5n1(null)

Re: [slurm-users] memory per node default

2022-01-20 Thread Ole Holm Nielsen
On 1/20/22 22:22, Hoot Thompson wrote:
> How do you change the default memory per node from the current 1MB to something much higher? Thanks in advance.
>
> ubuntu@node:/shared$ sinfo -o "%20N%10c%10m%25f%10G "
> NODELIST            CPUS      MEMORY    AVAIL_FEATURES           GRES
> hpc-demand-dy-c5n18x36        1         dynamic,c5n.18xlarge
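Ole's answer is cut off in this digest, but the MEMORY column above shows the issue: when a node definition in slurm.conf omits RealMemory, Slurm falls back to a default of 1 MB. The usual fix is to declare each node's usable RAM (in MB) on its NodeName line. A minimal sketch; the full node name is truncated in the sinfo output above, and 190000 MB is only an assumed figure for a c5n.18xlarge instance:

# slurm.conf on the controller (kept in sync on the compute nodes)
NodeName=<full-node-name> CPUs=36 RealMemory=190000 Feature=dynamic,c5n.18xlarge

# push the change out (a slurmd restart may also be needed)
ubuntu@node:/shared$ sudo scontrol reconfigure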