Re: [slurm-users] Slurm / OpenHPC socket timeout errors

2018-11-27 Thread Marcus Wagner
, 2018 9:38 AM To: slurm-users@lists.schedmd.com Subject: Re: [slurm-users] Slurm / OpenHPC socket timeout errors I wasn’t looking close enough at the times in the log file. c2: [2018-11-26T10:09:40.963] debug3: in the service_connection c2: [2018-11-26T10:10:00.983] debug

Re: [slurm-users] Slurm / OpenHPC socket timeout errors

2018-11-26 Thread Michael Robbert
On Behalf Of Kenneth Roberts Sent: Monday, November 26, 2018 9:38 AM To: slurm-users@lists.schedmd.com Subject: Re: [slurm-users] Slurm / OpenHPC socket timeout errors I wasn’t looking close enough at the times in the log file. c2: [2018-11-26T10:09:40

Re: [slurm-users] Slurm / OpenHPC socket timeout errors

2018-11-26 Thread Kenneth Roberts
users On Behalf Of Kenneth Roberts Sent: Monday, November 26, 2018 9:38 AM To: slurm-users@lists.schedmd.com Subject: Re: [slurm-users] Slurm / OpenHPC socket timeout errors I wasn't looking close enough at the times in the log file. c2: [2018-11-26T10:09:40.963] debu

Re: [slurm-users] Slurm / OpenHPC socket timeout errors

2018-11-26 Thread Kenneth Roberts
g out after 20 seconds. Back to finding out why ... From: slurm-users On Behalf Of Kenneth Roberts Sent: Monday, November 26, 2018 8:35 AM To: slurm-users@lists.schedmd.com Subject: Re: [slurm-users] Slurm / OpenHPC socket timeout errors Here is the debug log on a node (c2) when the job fa

Re: [slurm-users] Slurm / OpenHPC socket timeout errors

2018-11-26 Thread Kenneth Roberts
ors reading slurm.conf ... Continuing the search ... From: slurm-users On Behalf Of Kenneth Roberts Sent: Friday, November 23, 2018 4:15 PM To: slurm-users@lists.schedmd.com Subject: [slurm-users] Slurm / OpenHPC socket timeout errors Hi - I have the following on a new cluster

[slurm-users] Slurm / OpenHPC socket timeout errors

2018-11-23 Thread Kenneth Roberts
Hi - I have the following on a new cluster with OpenHPC & Slurm built off the latest recipe and packages from OpenHPC (built this week). One master node and 4 compute nodes. NodeName=c[1-4] Sockets=2 CoresPerSocket=10 ThreadsPerCore=1 State=UNKNOWN With simple test scripts, sbatch prod