Re: [slurm-users] slurmstepd: error: load_ebpf_prog: BPF load error (No space left on device). Please check your system limits (MEMLOCK).

2024-01-24 Thread Charles Hedrick
Since they took the patch, it's not needed if you're using the version they fixed. However it looks like they haven't released that version yet. The patch is to slurmd. You don't need it on the controller. If you're only having problems with some systems, you can put it just on those systems, bu

Re: [slurm-users] GPU devices mapping with job's cgroup in cgroups v2 using eBPF

2024-01-23 Thread Charles Hedrick
To see the specific GPU allocated, I think this will do it: scontrol show job -d | grep -E "JobId=| GRES" From: slurm-users on behalf of Mahendra Paipuri Sent: Sunday, January 7, 2024 3:33 PM To: slurm-users@lists.schedmd.com Subject: [slurm-users] GPU devices

Re: [slurm-users] slurmstepd: error: load_ebpf_prog: BPF load error (No space left on device). Please check your system limits (MEMLOCK).

2024-01-23 Thread Charles Hedrick
See my comments on https://bugs.launchpad.net/bugs/2050098. There's a pretty simple fix in Slurm. As far as I can tell, there's nothing wrong with the Slurm code. But it's using an option that it doesn't actually need, and that seems to be causing trouble in the kernel.

[slurm-users] problem with AllowedSwapSpace

2023-08-26 Thread Charles Hedrick
With cgroup v2, AllowedSwapSpace is implemented using memory.swap.max. In v1, it is memory.memsw.limit_in_bytes, which applies to the total of memory and swap. In v2, memory.swap.max covers only swap. Slurm adds the job memory size to AllowedSwapSpace. This is appropriate for v1, since the limit is on the sum. It is no
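
The difference described above can be sketched with a little arithmetic. This is only an illustration of the behavior reported in the message, not Slurm's actual code: the function names are hypothetical, and the formula (job memory plus AllowedSwapSpace percent of job memory) is an assumption taken from the description.

```python
def written_limit_mib(job_mem_mib: float, allowed_swap_pct: float) -> float:
    """Value written to the cgroup limit file, as described in the message:
    the job memory plus AllowedSwapSpace percent of it (assumed formula)."""
    return job_mem_mib + job_mem_mib * allowed_swap_pct / 100.0


def usable_swap_mib(job_mem_mib: float, allowed_swap_pct: float,
                    cgroup_version: int) -> float:
    """Swap the job can actually use under each cgroup version."""
    limit = written_limit_mib(job_mem_mib, allowed_swap_pct)
    if cgroup_version == 1:
        # v1: the memsw limit covers memory + swap, so subtract the memory part.
        return limit - job_mem_mib
    # v2: memory.swap.max covers swap only, so the whole value is swap.
    return limit


if __name__ == "__main__":
    for version in (1, 2):
        swap = usable_swap_mib(4096, 50, version)
        print(f"cgroup v{version}: usable swap = {swap} MiB")
```

With a 4096 MiB job and AllowedSwapSpace=50, the same written value means 2048 MiB of swap under v1 but 6144 MiB under v2, if the assumed formula holds.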

[slurm-users] cgroup swap limit cgroups v2

2023-08-26 Thread Charles Hedrick
It appears that when you set AllowedSwapSpace in cgroup.conf, this percentage is added to the requested memory amount. So to set memory.swap.max to 0, you have to set AllowedSwapSpace to -100. The arithmetic is imprecise and parts seem to use signed float, so you can get unexpected results. I'm actually
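
To see where -100 comes from, the relationship can be inverted. The sketch below assumes, based only on the observation in the message, that the value written to memory.swap.max is the requested memory times (1 + AllowedSwapSpace/100). The function name is hypothetical, and because the message reports imprecise arithmetic, real results may differ from this idealized calculation.

```python
def allowed_swap_pct_for(target_swap_mib: float, job_mem_mib: float) -> float:
    """AllowedSwapSpace percent that would yield the target swap limit,
    assuming swap_max = job_mem * (1 + pct / 100) (an assumed formula)."""
    if job_mem_mib <= 0:
        raise ValueError("job memory must be positive")
    return (target_swap_mib / job_mem_mib - 1.0) * 100.0


if __name__ == "__main__":
    # No swap at all for a 4096 MiB job.
    print(allowed_swap_pct_for(0, 4096))
    # Swap equal to half of the job memory.
    print(allowed_swap_pct_for(2048, 4096))
```

Under this assumption, any swap limit smaller than the requested memory needs a negative percentage, which matches the -100 reported for a limit of 0.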