Re: [slurm-users] fast way for a node to determine its own state?

Bjørn-Helge Mevik Wed, 21 Mar 2018 04:24:08 -0700

Alexis Huxley <alexis.hux...@mpcdf.mpg.de> writes:

>> >Depending on the load on the scheduler, this can be slow. Is there
>> >a faster way? Perhaps one that doesn't involve communicating with
>> >the scheduler node? Thanks!
>
> Thanks for the suggestion Ole, but we have something in place that
> we don't want to change at this time. We just need a faster way
> for a node to get its own status.


How about running sinfo or scontrol show node in a cron job on the
controller node, say once every minute, and saving the output to a file?
Then the nodes can simply grep in the file. We use this in our crontab:

*/2 * * * * scontrol --oneliner show node > /cluster/var/node-info.new 2>/dev/null && mv -f /cluster/var/node-info.new /cluster/var/node-info 2>/dev/null

So every two minutes, /cluster/var/node-info is updated (if the
scontrol command succeeds), and the nodes simply grep in that file.
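
For illustration, the node-side lookup could look roughly like the sketch
below. It assumes the --oneliner output has one line per node, starting
with NodeName=<name> and containing a State=<state> field. The function
name node_state and its optional file argument are made up for this
example, and it uses awk rather than a plain grep so that only the exact
node name matches (c1 should not match c10).

```shell
#!/bin/sh
# Hypothetical helper: print the State= value for one node from the cached file.
# Assumption: each line starts with "NodeName=<name>" and has a "State=<state>" field.
node_state() {
    node=$1
    file=${2:-/cluster/var/node-info}
    awk -v want="NodeName=$node" '
        $1 == want {
            # Scan the remaining fields for the one that begins with State=
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^State=/) {
                    sub(/^State=/, "", $i)
                    print $i
                    exit
                }
            }
        }
    ' "$file"
}

# Usage on a compute node (assuming short hostnames match the Slurm node names):
#   node_state "$(hostname -s)"
```

Since this only reads a file, it puts no load on the controller; the
price is that the answer can be up to two minutes old.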

Naturally, /cluster/var must be available on all nodes for this to work,
but we usually notice when the cluster file system goes down anyway. :)
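
Because the file is only replaced when scontrol succeeds, a node may
want to check that it is reasonably fresh before trusting it. A rough
sketch, assuming a find that supports -mmin (GNU and BSD find do; it is
not in POSIX); the function name node_info_is_fresh and the default
five-minute limit are made up for this example:

```shell
#!/bin/sh
# Hypothetical check: succeed only if the cached file exists and was
# modified less than max_age_min minutes ago.
node_info_is_fresh() {
    file=${1:-/cluster/var/node-info}
    max_age_min=${2:-5}
    # find prints the path only if its mtime is newer than the limit;
    # a missing file yields an error (discarded) and no output.
    [ -n "$(find "$file" -mmin "-$max_age_min" 2>/dev/null)" ]
}

# Usage:
#   node_info_is_fresh || echo "node-info is stale or missing" >&2
```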

-- 
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
