[slurm-users] Re: Error "_check_core_range_matches_sock" when starting "slurmctld"

2025-04-28 Thread Laura Hild via slurm-users
> Socket(s): 1
> NUMA node(s): 1
> [...]
> NodeName=mysystem Autodetect=off Name=gpu Type=geforce_gtx_titan_x File=/dev/nvidia0 CPUs=0-1
> NodeName=mysystem Autodetect=off Name=gpu Type=geforce_gtx_titan_black File=/dev/nvidia1 CPUs=2-3

What do you intend to achieve with CPU
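For reference, a gres.conf sketch along the lines of the quoted snippet (node name, GPU types, device files, and core ranges are the poster's; note that current Slurm releases spell the core-binding option `Cores=`, with `CPUs=` being the older, deprecated name):

```conf
# gres.conf -- sketch based on the quoted snippet; Cores= is the
# current name of the core-binding option (CPUs= is deprecated)
NodeName=mysystem AutoDetect=off Name=gpu Type=geforce_gtx_titan_x File=/dev/nvidia0 Cores=0-1
NodeName=mysystem AutoDetect=off Name=gpu Type=geforce_gtx_titan_black File=/dev/nvidia1 Cores=2-3
```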

[slurm-users] Re: Can't specify multiple partitions when submitting GPU jobs

2025-04-28 Thread milad--- via slurm-users
Update: I also noticed that specifying --ntasks makes a difference when --gpus is present. If I have two partitions, a100 and h100, that both have free GPUs:

✅ h100 specified first in -p: works
   sbatch -p h100,a100 --gpus h100:1 script.sh

❌ h100 specified second: doesn't work
   sbatch -p a100,h100 --

[slurm-users] Can't specify multiple partitions when submitting GPU jobs

2025-04-28 Thread milad--- via slurm-users
Hi, I have defined a partition for each GPU type we have in the cluster. This was mainly because I have different node types for each GPU type and I want to set `DefCpuPerGPU` and `DefMemPerGPU` for each of them. Unfortunately, one can't set these per node, but one can per partition. Now sometime
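For context, per-GPU defaults like these can only be attached to PartitionName lines in slurm.conf, which is the setup the message describes (partition names match the thread; node ranges and default values below are hypothetical):

```conf
# slurm.conf -- hypothetical partitions, one per GPU type, each
# carrying its own per-GPU CPU and memory defaults (these options
# are not settable on NodeName lines)
PartitionName=a100 Nodes=a100-[01-04] DefCpuPerGPU=8  DefMemPerGPU=32000
PartitionName=h100 Nodes=h100-[01-04] DefCpuPerGPU=12 DefMemPerGPU=64000
```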

[slurm-users] Error "_check_core_range_matches_sock" when starting "slurmctld"

2025-04-28 Thread Gestió Servidors via slurm-users
Hello, I have compiled Slurm 24.11.3 and configured two GPUs in my system (slurmctld and slurmd run on the same computer). The compute node has an old Intel i7 processor with 4 cores and 4 hyperthreads. The node is configured with "NodeName=mysystem CPUs=8 Boards=1 SocketsPerBoard=1 CoresPe
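A 4-core, 8-thread i7 would typically be described with a node line like the following sketch (the values after the truncation point are assumptions, following the usual convention that Sockets × CoresPerSocket × ThreadsPerCore equals CPUs):

```conf
# slurm.conf -- hypothetical node line for a 1-socket, 4-core,
# 2-threads-per-core CPU: 1 x 4 x 2 = 8 logical CPUs
NodeName=mysystem CPUs=8 Boards=1 SocketsPerBoard=1 CoresPerSocket=4 ThreadsPerCore=2
```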