The native job_container/tmpfs plugin certainly has access to the job record, so modifying it (or a forked variant of it) would be possible. A SPANK plugin should also be able to fetch the full job record [1] and inspect the "gres" list (as a C string), which means I could modify UD's auto_tmpdir accordingly. Having a compiled plugin exec xfs_quota to issue the commands illustrated below wouldn't be a great idea -- luckily, Linux XFS exposes the same operations through an API. Seemingly not the simplest one, but xfsprogs is a working example of its use.
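For illustration, the SPANK half might look something like this (untested; spank_get_item() and slurm_load_job() are the real Slurm calls, but the plugin name and the "tmpdisk" GRES are invented for the example, and older releases expose the request as a "gres" field rather than tres_per_node):

/* Hypothetical sketch: fetch the job record from a SPANK prolog hook
 * and inspect the GRES/TRES request string. */
#include <slurm/spank.h>
#include <slurm/slurm.h>
#include <stdint.h>
#include <string.h>

SPANK_PLUGIN(tmpdir_quota, 1);

int slurm_spank_job_prolog(spank_t sp, int ac, char **av)
{
    uint32_t jobid;
    job_info_msg_t *msg = NULL;

    if (spank_get_item(sp, S_JOB_ID, &jobid) != ESPANK_SUCCESS)
        return -1;

    /* Fetch the full job record from slurmctld, as the plugin in [1] does. */
    if (slurm_load_job(&msg, jobid, SHOW_DETAIL) != SLURM_SUCCESS)
        return -1;

    /* Recent Slurm carries the per-node request in tres_per_node,
     * e.g. "gres/tmpdisk:100G"; parse the size out of that string. */
    const char *tres = msg->job_array[0].tres_per_node;
    if (tres != NULL && strstr(tres, "gres/tmpdisk") != NULL) {
        /* ...extract the size and apply the XFS project quota... */
    }

    slurm_free_job_info_msg(msg);
    return 0;
}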
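And the XFS half without exec'ing xfs_quota: the project id is attached to the directory with the FS_IOC_FSSETXATTR ioctl, and the block limit is set with quotactl(2) using the XFS-specific Q_XSETQLIM command -- the same calls xfs_quota itself makes. Another untested sketch; the helper name and the block-device argument are placeholders for whatever the node actually uses:

/* Sketch of the project-quota API that xfs_quota wraps: tag the job's
 * directory with a project id, then set a hard block limit on that id. */
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/quota.h>
#include <linux/fs.h>          /* struct fsxattr, FS_IOC_FSSETXATTR */
#include <linux/dqblk_xfs.h>   /* struct fs_disk_quota, Q_XSETQLIM */
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#ifndef PRJQUOTA
#define PRJQUOTA 2             /* missing from older glibc sys/quota.h */
#endif

static int set_job_quota(const char *dir, const char *bdev,
                         uint32_t projid, uint64_t bytes)
{
    /* Equivalent of `xfs_quota -x -c 'project -s ...'`: stamp the
     * directory with the project id and let children inherit it. */
    int fd = open(dir, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    struct fsxattr fsx;
    if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0) { close(fd); return -1; }
    fsx.fsx_projid = projid;
    fsx.fsx_xflags |= FS_XFLAG_PROJINHERIT;
    if (ioctl(fd, FS_IOC_FSSETXATTR, &fsx) < 0) { close(fd); return -1; }
    close(fd);

    /* Equivalent of `xfs_quota -x -c 'limit -p bhard=...'`: XFS block
     * limits are counted in 512-byte basic blocks. */
    struct fs_disk_quota fsd;
    memset(&fsd, 0, sizeof(fsd));
    fsd.d_version = FS_DQUOT_VERSION;
    fsd.d_flags = FS_PROJ_QUOTA;
    fsd.d_id = projid;
    fsd.d_fieldmask = FS_DQ_BHARD;
    fsd.d_blk_hardlimit = bytes / 512;

    return quotactl(QCMD(Q_XSETQLIM, PRJQUOTA), bdev, (int)projid,
                    (caddr_t)&fsd);
}

Error handling aside, that replaces the prolog's two xfs_quota invocations with an open/ioctl pair and a single quotactl, so the prolog stays effectively O(1).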
[1] https://gitlab.hpc.cineca.it/dcesari1/slurm-msrsafe

> On Feb 7, 2024, at 05:25, Tim Schneider via slurm-users
> <slurm-users@lists.schedmd.com> wrote:
> 
> Hey Jeffrey,
> 
> thanks for this suggestion! This is probably the way to go if one can find
> a way to access GRES in the prolog. I read somewhere that people were
> calling scontrol to get this information, but this seems a bit unclean.
> Anyway, if I find some time I will try it out.
> 
> Best,
> 
> Tim
> 
> On 2/6/24 16:30, Jeffrey T Frey wrote:
>> Most of my ideas have revolved around creating file systems on-the-fly
>> as part of the job prolog and destroying them in the epilog. The issue
>> with that mechanism is that formatting a file system (e.g. mkfs.<type>)
>> can be time-consuming. E.g. if you format your local scratch SSD as an
>> LVM PV+VG and allocate per-job volumes, you'd still need to run e.g.
>> mkfs.xfs and mount the new file system.
>> 
>> ZFS file system creation is much quicker (it basically combines the LVM
>> + mkfs steps above), but I don't know of any clusters using ZFS to
>> manage local file systems on the compute nodes :-)
>> 
>> One could leverage XFS project quotas. E.g. for Slurm job 2147483647:
>> 
>> [root@r00n00 /]# mkdir /tmp-alloc/slurm-2147483647
>> [root@r00n00 /]# xfs_quota -x -c 'project -s -p /tmp-alloc/slurm-2147483647 2147483647' /tmp-alloc
>> Setting up project 2147483647 (path /tmp-alloc/slurm-2147483647)...
>> Processed 1 (/etc/projects and cmdline) paths for project 2147483647 with recursion depth infinite (-1).
>> [root@r00n00 /]# xfs_quota -x -c 'limit -p bhard=1g 2147483647' /tmp-alloc
>> [root@r00n00 /]# cd /tmp-alloc/slurm-2147483647
>> [root@r00n00 slurm-2147483647]# dd if=/dev/zero of=zeroes bs=5M count=1000
>> dd: error writing ‘zeroes’: No space left on device
>> 205+0 records in
>> 204+0 records out
>> 1073741824 bytes (1.1 GB) copied, 2.92232 s, 367 MB/s
>> 
>> :
>> 
>> [root@r00n00 /]# rm -rf /tmp-alloc/slurm-2147483647
>> [root@r00n00 /]# xfs_quota -x -c 'limit -p bhard=0 2147483647' /tmp-alloc
>> 
>> Since Slurm jobids max out at 0x03FFFFFF (and 2147483647 = 0x7FFFFFFF),
>> we have an easy on-demand project id to use on the file system. Slurm
>> tmpfs plugins have to do a mkdir to create the per-job directory, so
>> adding two xfs_quota commands (which run in more or less O(1) time)
>> won't extend the prolog by much. Likewise, Slurm tmpfs plugins have to
>> scrub the directory at job cleanup, so adding another xfs_quota command
>> will not do much to change their epilog execution times. The main
>> question is "where does the tmpfs plugin find the quota limit for the
>> job?"
>> 
>>> On Feb 6, 2024, at 08:39, Tim Schneider via slurm-users
>>> <slurm-users@lists.schedmd.com> wrote:
>>> 
>>> Hi,
>>> 
>>> In our SLURM cluster, we are using the job_container/tmpfs plugin to
>>> ensure that each user can use /tmp and it gets cleaned up after them.
>>> Currently, we are mapping /tmp into the node's RAM, which means that
>>> the cgroups make sure that users can only use a certain amount of
>>> storage inside /tmp.
>>> 
>>> Now we would like to use the node's local SSD instead of its RAM to
>>> hold the files in /tmp. I have seen people define local storage as
>>> GRES, but I am wondering how to make sure that users do not exceed the
>>> storage space they requested in a job. Does anyone have an idea how to
>>> configure local storage as a proper tracked resource?
>>> 
>>> Thanks a lot in advance!
>>> 
>>> Best,
>>> 
>>> Tim

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com