On Monday, 02 September 2019, at 20:02:57 (+0200), Ole Holm Nielsen wrote:

> We have some users requesting that a certain minimum size of the
> *Available* (i.e., free) TmpFS disk space should be present on nodes
> before a job should be considered by the scheduler for a set of
> nodes.
>
> I believe that the "sbatch --tmp=size" option merely refers to the
> TmpFS file system *Size* as configured in slurm.conf, and this is
> *not* what users need.
>
> For example, a job might require 50 GB of *Available disk space* on
> the TmpFS file system, which may however have only 20 GB out of 100
> GB *Available* as shown by the df command, the rest having been
> consumed by other jobs (present or past).
>
> However, when we do "scontrol show node <nodename>", only the TmpFS
> file system *Size* is displayed as a "TmpDisk" number, but not the
> *Available* number.
>
> Question: How can we get slurmd to report back to the scheduler the
> amount of *Available* disk space? And how can users specify the
> minimum *Available* disk space required by their jobs submitted by
> "sbatch"?
>
> If this is not feasible, are there other techniques that achieve the
> same goal? We're currently still at Slurm 18.08.

Hi, Ole! I'm assuming you want per-job resolution on this rather
than per-node? If per-node is good enough, you can of course use NHC
to check this, e.g.:

    * || check_fs_free /tmp 50GB

That doesn't work per-job, though, obviously. Something that might
work as a temporary workaround, however, is to have the user run a
single NHC command, like this:

    srun --prolog='nhc -e "check_fs_free /tmp 50GB"'

There might be some tweaks/caveats to this since NHC normally runs as
root, but just an idea.... :-)

An even crazier idea would be to set NHC_LOAD_ONLY=1 in the
environment, source /usr/sbin/nhc, and then execute the shell function
`check_fs_free` directly! :-D

HTH,
Michael

--
Michael E. Jennings <m...@lanl.gov>
HPC Systems Team, Los Alamos National Laboratory
Bldg. 03-2327, Rm. 2341     W: +1 (505) 606-0605
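
For the per-job case, the same kind of free-space test can also be done
without NHC, at the top of the user's batch script. The following is only
a minimal sketch: `require_free_gb` is a hypothetical helper (it is not
part of NHC or Slurm), it counts in GiB (1024-based) rather than NHC's
"50GB" notation, and it assumes the mount's device name contains no
spaces, so that the fourth column of `df -P` output is the available
space.

```shell
#!/bin/sh
# Hypothetical helper, not part of NHC or Slurm: succeed only if the
# file system holding DIR has at least MIN_GB GiB available.
require_free_gb() {
    dir=$1
    min_gb=$2
    # -P forces one POSIX-format line per file system; -k reports
    # 1024-byte blocks. Line 2, column 4 is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    # Return 2 if df failed or printed nothing usable.
    [ -n "$avail_kb" ] || return 2
    # 1 GiB = 1024 * 1024 KiB
    min_kb=$((min_gb * 1024 * 1024))
    [ "$avail_kb" -ge "$min_kb" ]
}
```

A job script could then start with something like
`require_free_gb /tmp 50 || exit 1`. Note that this only makes the job
fail fast on a node that is too full; it does not influence where the
scheduler places the job.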