[slurm-users] Problem with srun on ARM Ubuntu servers

2023-07-21 Thread Daniel L'Hommedieu
Hi, everyone. My team runs a SLURM cluster of about 800 servers, currently on SLURM17, though we are working to upgrade to 22. We currently have only x64 front-end servers, but we are looking to add some ARM servers. I have deployed some new ARM front-end servers in exactly the same way the x64 on

Re: [slurm-users] [EXT] --mem is not limiting the job's memory

2023-07-21 Thread Boris Yazlovitsky
Thanks to all who responded! Setting SelectTypeParameters = CR_CPU_Memory did the trick. On Fri, Jun 23, 2023 at 3:21 AM Shunran Zhang <szh...@ngs.gen-info.osaka-u.ac.jp> wrote: > Hi > > Would you mind to check your job scheduling settings in slurm.conf ? > > Namely *SelectTypeParameters

Re: [slurm-users] slurm sinfo format memory

2023-07-21 Thread Michael DiDomenico
another option besides those mentioned would be to frontend sinfo with jq/python and parse the data through json/yaml On Thu, Jul 20, 2023 at 12:28 PM Arsene Marian Alain wrote: > > > > Dear slurm users, > > > > I would like to see the following information of my nodes "hostname, total > mem, fr

Re: [slurm-users] [EXTERNAL]Re: slurm sinfo format memory

2023-07-21 Thread Roberto Monti
The proposed solution will break for values >=1000G. As sinfo is apparently stuck with megabytes, you will have to do something like: sinfo -o "%n %m %e %C" | awk '$3 ~ /[0-9]+/ {printf "%s %iG %iG %s\n", $1, $2 / 1024, $3 / 1024, $4}' numfmt is another option, but it is only with newer version
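For readers who would rather post-process in Python than awk, the same MB-to-GB conversion might be sketched as follows. This is only an illustration, assuming the four-column output of sinfo -o "%n %m %e %C" quoted above; the function name format_sinfo_memory and the sample rows are hypothetical, not taken from the thread. Like the awk filter, it skips any row whose memory fields are not purely numeric (the header line, or a free-memory value such as N/A).

```python
def format_sinfo_memory(lines):
    """Convert the two memory columns (in MB) to whole GB.

    Expects lines shaped like the output of: sinfo -o "%n %m %e %C"
    i.e. hostname, total memory, free memory, CPU counts.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # not a four-column row, ignore it
        host, total_mb, free_mb, cpus = fields
        if not (total_mb.isdigit() and free_mb.isdigit()):
            continue  # header line or non-numeric value such as N/A
        # Integer division drops the decimals, as the awk %i format does.
        rows.append(f"{host} {int(total_mb) // 1024}G {int(free_mb) // 1024}G {cpus}")
    return rows


# Hypothetical sample input; on a real cluster the lines would come from
# running sinfo (for example via subprocess) and splitting its output.
sample = [
    "HOSTNAMES MEMORY FREE_MEM CPUS(A/I/O/T)",
    "node01 190560 120000 0/32/0/32",
    "node02 1031000 N/A 0/64/0/64",
]
for row in format_sinfo_memory(sample):
    print(row)
```

Dividing by 1024 rather than trimming digits keeps values of 1000G and above intact, which is the failure mode the message above warns about.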

Re: [slurm-users] configure script can't find nvml.h or libnvidia-ml.so

2023-07-21 Thread Jan Andersen
Right, so I have managed to get the nvidia tools installed and I can see the files now: root@zorn:~/slurm-23.02.3# find / -xdev -name libnvidia-ml.so /usr/local/cuda-12.2/targets/x86_64-linux/lib/stubs/libnvidia-ml.so /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ml.so root@zorn:~/slurm-23.

Re: [slurm-users] slurm sinfo format memory

2023-07-21 Thread Ole Holm Nielsen
Hi Arsene, On 7/20/23 18:24, Arsene Marian Alain wrote: > I would like to see the following information of my nodes "hostname, total > mem, free mem and cpus". So, I used ‘sinfo -o "%8n %8m %8e %C"’ but in > the output it shows me the memory in MB like "190560" and I need it in GB > (without d

Re: [slurm-users] slurm sinfo format memory

2023-07-21 Thread Kevin Buckley
On 2023/07/21 00:24, Arsene Marian Alain wrote: I would like to see the following information of my nodes "hostname, total mem, free mem and cpus". So, I used 'sinfo -o "%8n %8m %8e %C"' but in the output it shows me the memory in MB like "190560" and I need it in GB (without decimals if poss