Call the health check through a shell script that starts with:
sleep $(( RANDOM % 10 + 1 )), or similar.
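
A minimal sketch of such a wrapper, assuming bash is available on the nodes. The function names, the NHC_BIN and MAX_DELAY variables, and the wrapper path mentioned below are hypothetical, not part of NHC or Slurm. Keep the maximum delay small: as far as I know, slurmd kills a HealthCheckProgram that does not finish within 60 seconds, so the delay plus the NHC run time has to stay under that.

```shell
#!/bin/bash
# Hypothetical wrapper around NHC that sleeps a random 1..MAX_DELAY seconds first.
# NHC_BIN and MAX_DELAY are illustrative names; adjust them for your site.
NHC_BIN="${NHC_BIN:-/usr/sbin/nhc}"
MAX_DELAY="${MAX_DELAY:-10}"

# Print a random integer between 1 and MAX_DELAY (inclusive).
random_delay() {
    echo $(( RANDOM % MAX_DELAY + 1 ))
}

# Sleep for the random delay, then run the real health check and
# return its exit status.
delayed_nhc() {
    sleep "$(random_delay)"
    "$NHC_BIN" "$@"
}

# Uncomment when installing this file as the health check wrapper:
# delayed_nhc "$@"
```

With the last line uncommented and the file saved somewhere like /usr/local/sbin/nhc-delayed (an assumed path), HealthCheckProgram in slurm.conf would point at the wrapper instead of /usr/sbin/nhc.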

M.K.
________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of SJTU 
<weijian...@sjtu.edu.cn>
Sent: Thursday, November 26, 2020 8:24 PM
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Set a ramdom offset when starting node health check in 
SLURM

Hi,

   We use HealthCheckProgram = /usr/sbin/nhc in Slurm to check node health 
every 600 seconds. However, some NHC checks point to the same central 
resource, so starting these checks simultaneously may lead to false alarms of 
service degradation.

   Is it possible to add a random offset to the time at which HealthCheckProgram starts?


Thank you!

Jianwen
