Thank you for this note. The comments on the issue you raised have saved
me a lot of time.

I agree that the documentation around cpu assignment/allocation is
confusing and is part of the problem. Of particular concern is that cpu
allocation changes when adding a gres that has nothing to do with cpu
assignment. The non-deterministic behavior here is unsettling.
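
For anyone who wants to check the placement from inside a job, here is a
minimal sketch (not part of Slurm) that reports how the cpus a process is
bound to are spread across sockets. It assumes a Linux node that exposes
the usual sysfs topology files; the helper names parse_cpu_list, socket_of
and group_by_socket are made up for this illustration.

```python
#!/usr/bin/env python3
"""Report how the cpus in this process's affinity mask map onto sockets."""
import os
from collections import defaultdict
from pathlib import Path

# Assumption: standard Linux sysfs location for per-cpu topology files.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def parse_cpu_list(text):
    """Expand a cpu list such as '0,2,4-6' (taskset style) into sorted ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def socket_of(cpu, sysfs=SYSFS_CPU):
    """Return the physical package (socket) id that sysfs reports for a cpu."""
    path = sysfs / f"cpu{cpu}" / "topology" / "physical_package_id"
    return int(path.read_text())


def group_by_socket(cpus, socket_lookup=socket_of):
    """Group cpu ids by socket id; socket_lookup can be swapped for testing."""
    groups = defaultdict(list)
    for cpu in cpus:
        groups[socket_lookup(cpu)].append(cpu)
    return dict(groups)


if __name__ == "__main__":
    # Affinity of the current process, i.e. what 'taskset -cp $$' would show.
    bound = sorted(os.sched_getaffinity(0))
    for socket, cpus in sorted(group_by_socket(bound).items()):
        print(f"socket {socket}: {len(cpus)} cpus -> {cpus}")
```

Run inside the srun shell, this prints one line per socket that holds at
least one bound cpu, so an allocation confined to a single socket shows up
as a single line.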

I guess I will need to have a look at a plugin so that we can ensure
deterministic behavior.

We have already modified the slurm source to allow regular users to
create their own reservations, so perhaps I might get annoyed enough to
'fix' this in the slurm code base.

Thank you!

Alan

On 6/7/24 18:36, Juergen Salk wrote:
> Hi Alan,
>
> unfortunately, process placement in Slurm is kind of black magic for
> sub-node jobs, i.e. jobs that allocate only a small number of CPUs of
> a node. 
>
> I have recently raised a similar question here:
>
>  https://support.schedmd.com/show_bug.cgi?id=19236
>
> And the bottom line was that to "really have control over task placement
> you really have to allocate the node in --exclusive manner".
>
> Best regards
> Jürgen
>
>
> * Alan Stange via slurm-users <slurm-users@lists.schedmd.com> [240607 14:52]:
>> All,
>>
>> I have a very simple slurm cluster.  It's just a single system with 2
>> sockets and 16 cores in each socket.  I would like to be able to submit
>> a simple task into this cluster, and to have the cpus assigned to that
>> task allocated round robin across the two sockets.   Everything I try is
>> putting all the cpus for this single task on the same socket.
>>
>> I have not specified any CpuBind options in the slurm.conf file.   For
>> example, if I try
>>
>> $ srun -c 4 --pty bash
>>
>> I get a shell prompt on the system, and can run
>>
>> $ taskset -cp $$
>> pid 12345 current affinity list: 0,2,4,6
>>
>> and I get this same set of cpus no matter what options I try (the
>> cluster is idle with no tasks consuming slots).
>>
>> I've tried various srun command line options like:
>> --hint=compute_bound
>> --hint=memory_bound
>> various --cpubind options
>> -B 2:2 -m block:cyclic and block:fcyclic
>>
>> Note that if I try to allocate 17 cpus, then I do get the 17th cpu
>> allocated on the 2nd socket.
>>
>>
>> What magic incantation is needed to get an allocation where the cpus are
>> chosen round robin across the sockets?
>>
>> Thank you!
>>
>> Alan
>>
>>
>> -- 
>> slurm-users mailing list -- slurm-users@lists.schedmd.com
>> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
> -- 
> Jürgen Salk
> Scientific Software & Compute Services (SSCS)
> Kommunikations- und Informationszentrum (kiz)
> Universität Ulm
> Telefon: +49 (0)731 50-22478
> Telefax: +49 (0)731 50-22471

