[slurm-users] Re: slurmctld hourly: Unexpected missing socket error

2024-07-29 Thread Jason Ellul via slurm-users
termaccc> From: Patryk Bełzak via slurm-users Date: Wednesday, 24 July 2024 at 8:03 PM To: Jason Ellul via slurm-users Subject: [slurm-users] Re: slurmctld hourly: Unexpected missing socket error Hi, we

[slurm-users] Re: slurmctld hourly: Unexpected missing socket error

2024-07-22 Thread Jason Ellul via slurm-users
Date: Monday, 22 July 2024 at 6:03 PM To: Jason Ellul via slurm-users Subject: [slurm-users] Re: slurmctld hourly: Unexpected missing socket error Hi, we've been facing the same issue for some time. At th

[slurm-users] slurmctld hourly: Unexpected missing socket error

2024-07-15 Thread Jason Ellul via slurm-users
Hi all, I am hoping someone can help with our problem. Every hour after restarting slurmctld the controller becomes unresponsive to commands for 1 sec, reporting errors such as: [2024-07-15T11:45:48.509] error: slurm_send_node_msg: [socket:[934767]] slurm_bufs_sendto(msg_type=RESPONSE_JOB_INFO