Hi Davide,

Thanks for your feedback.

If gpu01 and cpusingpu01 are physically the same node, doesn't this mean
that I have to run two slurmd daemons on that node (one started with
"slurmd -N gpu01" and one with "slurmd -N cpusingpu01")?


Thanks, Massimo


On Mon, Mar 31, 2025 at 3:22 PM Davide DelVento <davide.quan...@gmail.com>
wrote:

> Ciao Massimo,
> How about creating another queue, cpus_in_the_gpu_nodes (or something less
> silly), which targets the GPU nodes but does not allow allocation of the
> GPUs via gres, and which only exposes 96-8 of the CPUs (or whatever other
> number you deem appropriate), and similarly for memory? Actually, it could
> even be the same "onlycpus" queue, just on different nodes.
>
> In fact, in Slurm you declare the cores (and sockets) in a node-based, not
> queue-based, fashion. But you can set up an alias for those nodes with a
> second name and use that second name in the way described above. I am not
> aware of Slurm being able to understand such a situation on its own (and I
> have not searched for it), so you will have to manually avoid "double
> booking". One way of doing that could be to configure the nodes under their
> first name so that Slurm thinks they have fewer resources. So for example
> in slurm.conf
>
> NodeName=gpu[01-06] CoresPerSocket=4 RealMemory=whatever1 Sockets=2
> ThreadsPerCore=1 Weight=10000 State=UNKNOWN Gres=gpu:h100:4
> NodeName=cpusingpu[01-06] CoresPerSocket=44 RealMemory=whatever2 Sockets=2
> ThreadsPerCore=1 Weight=10000 State=UNKNOWN
>
> where gpuNN and cpusingpuNN are physically the same node, and whatever1 +
> whatever2 is the actual maximum amount of memory you want Slurm to
> allocate. You will also want to make sure the Weight values are such that
> the non-GPU nodes get used first.
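>
> To make the partition side of this concrete (again just placeholders, and
> I am assuming the CPU-only nodes are called cpu[01-06] and keep the specs
> from your message), something along these lines:
>
> NodeName=cpu[01-06] CoresPerSocket=96 RealMemory=whatever3 Sockets=2
> ThreadsPerCore=2 Weight=1 State=UNKNOWN
> PartitionName=gpus Nodes=gpu[01-06] State=UP
> PartitionName=onlycpus Nodes=cpu[01-06],cpusingpu[01-06] State=UP
>
> With Weight=1 on the real CPU-only nodes and Weight=10000 on the
> gpu/cpusingpu aliases, Slurm should pick the CPU-only nodes first whenever
> they are free, since lower-weight nodes are allocated before higher-weight
> ones.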
>
> Disclaimer: I'm thinking out loud, I have not tested this in practice,
> there may be something I overlooked.
>
> On Mon, Mar 31, 2025 at 5:12 AM Massimo Sgaravatto via slurm-users <
> slurm-users@lists.schedmd.com> wrote:
>
>> Dear all
>>
>>
>>
>> We have just installed a small SLURM cluster composed of 12 nodes:
>>
>> - 6 CPU-only nodes: Sockets=2, CoresPerSocket=96, ThreadsPerCore=2,
>> 1.5 TB of RAM
>> - 6 nodes that also have GPUs: same configuration as the CPU-only nodes,
>> plus 4 H100 GPUs per node
>>
>>
>> We started with a setup with 2 partitions:
>>
>> - a 'onlycpus' partition which sees all the cpu-only nodes
>> - a 'gpus' partition which sees the nodes with gpus
>>
>> and asked users to use the 'gpus' partition only for jobs that need GPUs
>> (for the time being we are not technically enforcing that).
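>>
>> In slurm.conf terms the current setup is roughly the following (simplified
>> sketch; I am calling the CPU-only nodes cpu[01-06] here):
>>
>> PartitionName=onlycpus Nodes=cpu[01-06] State=UP
>> PartitionName=gpus Nodes=gpu[01-06] State=UP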
>>
>>
>> The problem is that a job requiring a GPU usually needs only a few cores
>> and a few GB of RAM, which means that a lot of CPU cores on the GPU nodes
>> sit idle and are wasted. On the other hand, putting all nodes in the same
>> partition would mean risking that a job requiring a GPU can't start because
>> all the CPU cores and/or all the memory are already used by CPU-only jobs.
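>>
>> To give an idea, a typical GPU job here asks for something like the
>> following (numbers are just indicative):
>>
>> #SBATCH --partition=gpus
>> #SBATCH --gres=gpu:h100:1
>> #SBATCH --cpus-per-task=4
>> #SBATCH --mem=16G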
>>
>>
>> I went through the mailing list archive and I think that "splitting" a
>> GPU node into two logical nodes (one to be used in the 'gpus' partition and
>> one to be used in the 'onlycpus' partition) as discussed in [*] would help.
>>
>>
>> Since that proposed solution is considered by its author a "bit of a
>> kludge", and since I read that splitting a node into multiple logical nodes
>> is in general a bad idea, I'd like to know whether you could suggest
>> other/better options.
>>
>>
>> I also found this [**] thread, but I don't like that approach too much
>> (i.e. relying on MaxCPUsPerNode), because, if I have got it right, it would
>> mean having 3 partitions: two partitions for CPU-only jobs and one
>> partition for GPU jobs.
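>>
>> If I have understood that thread correctly, the setup would look roughly
>> like this (just a sketch, the partition name and the MaxCPUsPerNode value
>> are made up):
>>
>> PartitionName=onlycpus Nodes=cpu[01-06] State=UP
>> PartitionName=cpusongpus Nodes=gpu[01-06] MaxCPUsPerNode=88 State=UP
>> PartitionName=gpus Nodes=gpu[01-06] State=UP
>>
>> i.e. CPU-only jobs would end up spread over the first two partitions,
>> which is what I would like to avoid.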
>>
>>
>> Many thanks, Massimo
>>
>>
>> [*] https://groups.google.com/g/slurm-users/c/IUd7jLKME3M
>> [**] https://groups.google.com/g/slurm-users/c/o7AiYAQ1YJ0
>>
>>
>
-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
