Hi Slurm Users,

first time posting. I have a new Slurm setup where users can specify an amount of local node disk space they wish to use. This is a "gres" resource named "local", measured in GB. Once a user's job is scheduled and starts executing, the node prolog creates a folder for the job on the node and adds an XFS project quota for it, with the requested amount as the soft limit and +5% as the hard limit. The user prolog then sets this folder as the job's $TMPDIR. Finally, the node epilog removes the quota and the folder on job completion.
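
For context, here is a boiled-down sketch of the prolog side (not my exact script; the /local mountpoint, reusing the job ID as the XFS project ID, and the hard-coded requested size are stand-ins):

    #!/usr/bin/env python3
    # Node prolog sketch: create the job folder and put an XFS project
    # quota on it. Assumes the local filesystem is mounted at /local and
    # that the job ID doubles as the XFS project ID.
    import os
    import subprocess

    MOUNT = "/local"                  # assumed XFS mountpoint
    job_id = os.environ["SLURM_JOB_ID"]
    job_dir = os.path.join(MOUNT, job_id)
    # Placeholder: in practice the requested gres:local amount is parsed
    # from the job record, e.g. via "scontrol show job $SLURM_JOB_ID".
    requested_gb = 100

    os.makedirs(job_dir)
    subprocess.run(
        ["xfs_quota", "-x",
         # register the folder as XFS project <job_id> ...
         "-c", f"project -s -p {job_dir} {job_id}",
         # ... and set the requested amount as soft, +5% as hard limit
         "-c", f"limit -p bsoft={requested_gb}g "
               f"bhard={int(requested_gb * 1.05)}g {job_id}",
         MOUNT],
        check=True)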

This all works great so far. Now I have been working on an email script that would notify users when their "local" allocation is used up. Since Slurm itself has no idea what gres:local actually is and only manages it as a number, I have to do this myself. My plan was to check the quota in the node epilog on job termination to see how much was actually used, but I've now run into a snag: how do I get this information to the MailProg configured in slurm.conf?
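
For illustration, the epilog-side check looks roughly like this (again a sketch; the column layout assumed for the xfs_quota output is the usual one-line "quota -p -N -b" report):

    #!/usr/bin/env python3
    # Node epilog sketch: read the project's usage before tearing down
    # the quota and the folder. Same assumptions as the prolog sketch.
    import os
    import shutil
    import subprocess

    MOUNT = "/local"
    job_id = os.environ["SLURM_JOB_ID"]
    job_dir = os.path.join(MOUNT, job_id)

    # One line per project: <device> <used> <soft> <hard> ... (1K blocks)
    out = subprocess.run(
        ["xfs_quota", "-x", "-c", f"quota -p -N -b {job_id}", MOUNT],
        capture_output=True, text=True, check=True).stdout
    fields = out.split()
    used_kb, soft_kb = int(fields[1]), int(fields[2])
    pct_used = 100 * used_kb / soft_kb if soft_kb else 0

    # Tear down: drop the limits, then delete the folder.
    subprocess.run(
        ["xfs_quota", "-x", "-c", f"limit -p bsoft=0 bhard=0 {job_id}",
         MOUNT], check=True)
    shutil.rmtree(job_dir)
    # pct_used now has to reach the MailProg somehow, which is the snag.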

The arguments to that program always appear to be of this form:
-s SLURM Job_id=327 Name=ddt_clone Ended, Run time 00:05:01, COMPLETED, ExitCode 0

and the script's environment contains only the cluster name, nothing else.

The question now becomes: how do I get the quota status at the end of the job from the node epilog to the MailProg running on the head node? I can parse the job ID from the argument line passed to the script and thus get all the job's information via scontrol. So my first thought was that if I could add my own data field to that scontrol output, it would solve my problem. Unfortunately, I can't seem to find such an option.
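
Parsing the job ID is the easy part; a MailProg wrapper along these lines (a sketch, assuming the argument form shown above) already gets me everything scontrol knows about the job, just not my quota figure:

    #!/usr/bin/env python3
    # MailProg wrapper sketch: extract the job ID from the subject line
    # Slurm passes via -s, then look the job up with scontrol.
    import re
    import subprocess
    import sys

    if "-s" in sys.argv:
        subject = sys.argv[sys.argv.index("-s") + 1]
        m = re.search(r"Job_id=(\d+)", subject)
        if m:
            job_id = m.group(1)
            job_info = subprocess.run(
                ["scontrol", "show", "job", job_id],
                capture_output=True, text=True).stdout
            # job_info holds everything scontrol reports, but there is
            # no custom field to carry the quota usage in.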

Other than that, I've only come up with writing some sort of file to a shared storage mount that the MailProg could then read.
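
That would look something like this (a sketch; the /shared/quota-reports path is made up, and any mount visible from both the compute nodes and the head node would do):

    #!/usr/bin/env python3
    # Shared-file handoff sketch: the node epilog drops one small file
    # per job, the MailProg wrapper picks it up by job ID and deletes it.
    import os

    REPORT_DIR = "/shared/quota-reports"    # hypothetical shared path

    def write_report(job_id, pct_used):
        # node epilog side, after reading the quota
        with open(os.path.join(REPORT_DIR, job_id), "w") as f:
            f.write(f"{pct_used:.1f}\n")

    def read_report(job_id):
        # MailProg wrapper side; returns None if the epilog wrote nothing
        path = os.path.join(REPORT_DIR, job_id)
        if not os.path.exists(path):
            return None
        with open(path) as f:
            pct_used = float(f.read())
        os.unlink(path)  # remove right away so nothing goes stale
        return pct_used

Deleting the file after reading at least keeps the share from filling up with stale reports, but it still feels like a workaround.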

Can you think of a more elegant way to attach this information to the job, so that the MailProg on the head node can access it via the job ID?

Any help is appreciated!
