I’ll have to fix it later this week; I’m out today due to eye surgery. It looks like something didn’t get carried across to 1.10 as it should have. There are other tradeoffs when you go to direct launch (e.g., loss of MPI dynamic process support), which may or may not be a concern for your usage.
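For context on that “dynamics” tradeoff (not part of the original reply): with direct launch via srun there is no mpirun/ORTE available to service MPI’s dynamic process management, so calls such as MPI_Comm_spawn generally cannot be used. A minimal sketch of the kind of code affected follows; the "./worker" executable name is just a placeholder.

/* Minimal illustration of MPI dynamic process management.
 * Works when launched by mpirun; not supported under srun direct launch. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    MPI_Comm intercomm;

    /* Spawn two additional worker processes at runtime ("./worker" is hypothetical). */
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                   0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}

Under mpirun this starts two extra processes at runtime; under srun direct launch the spawn is expected to fail.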
> On Oct 6, 2015, at 11:57 AM, marcin.krotkiewski <marcin.krotkiew...@gmail.com> wrote:
> 
> Thanks, Gilles. This is a good suggestion and I will pursue this direction. The problem is that currently SLURM does not support --cpu_bind on my system, for whatever reason. I may work towards turning this option on if that becomes necessary, but it would also be good to be able to do it with pure Open MPI.
> 
> Marcin
> 
> On 10/06/2015 08:01 AM, Gilles Gouaillardet wrote:
>> Marcin,
>> 
>> did you investigate direct launch (e.g. srun) instead of mpirun?
>> 
>> For example, you can do:
>> srun --ntasks=2 --cpus-per-task=4 -l grep Cpus_allowed_list /proc/self/status
>> 
>> Note, you might have to use the srun --cpu_bind option, and make sure your SLURM config does support that:
>> srun --ntasks=2 --cpus-per-task=4 --cpu_bind=core,verbose -l grep Cpus_allowed_list /proc/self/status
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> On 10/6/2015 4:38 AM, marcin.krotkiewski wrote:
>>> Yet another question about cpu binding under the SLURM environment.
>>> 
>>> Short version: will Open MPI support SLURM_CPUS_PER_TASK for the purpose of cpu binding?
>>> 
>>> Full version: when you allocate a job like, e.g., this
>>> 
>>> salloc --ntasks=2 --cpus-per-task=4
>>> 
>>> SLURM will allocate 8 cores in total, 4 for each 'assumed' MPI task. This is useful for hybrid jobs, where each MPI process spawns some internal worker threads (e.g., OpenMP). The intention is that 2 MPI processes are started, each of them 'bound' to 4 cores. SLURM will also set an environment variable
>>> 
>>> SLURM_CPUS_PER_TASK=4
>>> 
>>> which should (probably?) be taken into account by whatever launches the MPI processes to figure out the cpuset. In the case of Open MPI + mpirun, I think something should happen in orte/mca/ras/slurm/ras_slurm_module.c, where the variable _is_ actually parsed. Unfortunately, it is never really used...
>>> 
>>> As a result, the cpuset of every task started on a given compute node includes all CPU cores of all MPI tasks on that node, just as provided by SLURM (in the above example, 8). In general, there is no simple way for the user code in the MPI processes to 'split' the cores between themselves [a user-level workaround is sketched after this quoted thread]. I imagine the original intention to support this in Open MPI was something like
>>> 
>>> mpirun --bind-to subtask_cpuset
>>> 
>>> with an artificial bind target that would cause Open MPI to divide the allocated cores between the MPI tasks. Is this right? If so, it seems that at this point this is not implemented. Are there plans to do this? If not, does anyone know another way to achieve this?
>>> 
>>> Thanks a lot!
>>> 
>>> Marcin
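Regarding Marcin's point that the MPI processes have no simple way to split the allocated cores between themselves: until mpirun honors SLURM_CPUS_PER_TASK, a user-level workaround is to carve the node-wide cpuset up by local rank at process startup. The sketch below is Linux-only and illustrative rather than authoritative; it assumes Open MPI's OMPI_COMM_WORLD_LOCAL_RANK is present in the environment (under srun direct launch, SLURM_LOCALID plays the same role), takes a simple contiguous slice of the allowed CPUs, and ignores NUMA and hyperthread placement.

/* Sketch: restrict this process to its own slice of the node-wide cpuset
 * handed out by SLURM, using SLURM_CPUS_PER_TASK and the local rank.
 * Assumes mpirun launch (OMPI_COMM_WORLD_LOCAL_RANK); swap in SLURM_LOCALID
 * for srun direct launch. Linux-only (sched_getaffinity/sched_setaffinity). */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *cpt = getenv("SLURM_CPUS_PER_TASK");
    const char *lrk = getenv("OMPI_COMM_WORLD_LOCAL_RANK");
    if (!cpt || !lrk) {
        fprintf(stderr, "required environment variables not set\n");
        return 1;
    }
    int cpus_per_task = atoi(cpt);
    int local_rank    = atoi(lrk);

    /* Current (node-wide) cpuset as provided by SLURM. */
    cpu_set_t allowed, mine;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) {
        perror("sched_getaffinity");
        return 1;
    }

    /* Walk the allowed CPUs in order and keep only this rank's slice:
     * positions [local_rank*cpus_per_task, (local_rank+1)*cpus_per_task). */
    CPU_ZERO(&mine);
    int idx = 0;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &allowed))
            continue;
        if (idx >= local_rank * cpus_per_task &&
            idx <  (local_rank + 1) * cpus_per_task)
            CPU_SET(cpu, &mine);
        idx++;
    }

    if (sched_setaffinity(0, sizeof(mine), &mine) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    return 0;
}

The same logic can be placed at the top of each rank, before any OpenMP worker threads are created, and the result checked with the grep Cpus_allowed_list /proc/self/status trick from Gilles' example.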