Re: [OMPI users] Hybrid OpenMPI+OpenMP tasks using SLURM

marcin.krotkiewski Tue, 6 Oct 2015 14:57:13 -0400 (EDT)

Thanks, Gilles. This is a good suggestion and I will pursue this direction. The problem is that SLURM currently does not support --cpu_bind on my system, for whatever reason. I may work towards turning this option on if that proves necessary, but it would also be good to be able to do it with pure OpenMPI.


Marcin


On 10/06/2015 08:01 AM, Gilles Gouaillardet wrote:

Marcin,

did you investigate direct launch (e.g. srun) instead of mpirun ?

for example, you can do
srun --ntasks=2 --cpus-per-task=4 -l grep Cpus_allowed_list /proc/self/status
note, you might have to use the srun --cpu_bind option, and make sure your slurm config does support that:
srun --ntasks=2 --cpus-per-task=4 --cpu_bind=core,verbose -l grep Cpus_allowed_list /proc/self/status
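
The grep commands above print the raw Cpus_allowed_list field, which the kernel reports as a compact list such as 0-3,8. As a rough illustration only (not something posted in this thread), the same check could be done from inside a process with a few lines of Python. The helper names parse_cpu_list and allowed_cpus are made up for this sketch, and it assumes a Linux /proc filesystem.

```python
def parse_cpu_list(text):
    """Expand a Linux CPU list such as '0-3,8' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty fields
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def allowed_cpus(status_path="/proc/self/status"):
    """Return the CPUs this process may run on, read from Cpus_allowed_list.

    Assumption: Linux, with the usual 'Cpus_allowed_list:' line present.
    Returns an empty list if the field is not found.
    """
    with open(status_path) as f:
        for line in f:
            if line.startswith("Cpus_allowed_list:"):
                return parse_cpu_list(line.split(":", 1)[1])
    return []
```

Calling allowed_cpus() in each launched task and printing the result should show whether the tasks got disjoint core sets or all share the full node allocation.
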
Cheers,

Gilles

On 10/6/2015 4:38 AM, marcin.krotkiewski wrote:
Yet another question about cpu binding under SLURM environment..
Short version: will OpenMPI support SLURM_CPUS_PER_TASK for the purpose of cpu binding?
Full version: When you allocate a job like, e.g., this

salloc --ntasks=2 --cpus-per-task=4
SLURM will allocate 8 cores in total, 4 for each 'assumed' MPI task. This is useful for hybrid jobs, where each MPI process spawns some internal worker threads (e.g., OpenMP). The intention is that there are 2 MPI procs started, each of them 'bound' to 4 cores. SLURM will also set an environment variable
SLURM_CPUS_PER_TASK=4
which should (probably?) be taken into account by the method that launches the MPI processes to figure out the cpuset. In the case of OpenMPI + mpirun I think something should happen in orte/mca/ras/slurm/ras_slurm_module.c, where the variable _is_ actually parsed. Unfortunately, it is never really used...
As a result, the cpuset of all tasks started on a given compute node includes all CPU cores of all MPI tasks on that node, just as provided by SLURM (in the above example, 8). In general, there is no simple way for the user code in the MPI procs to 'split' the cores between themselves. I imagine the original intention to support this in OpenMPI was something like
mpirun --bind-to subtask_cpuset
with an artificial bind target that would cause OpenMPI to divide the allocated cores between the MPI tasks. Is this right? If so, it seems that at this point this is not implemented. Are there plans to do this? If not, does anyone know another way to achieve that?
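
To make the 'split' described above concrete, here is a minimal sketch of what each process could do for itself at startup. It is an illustration under stated assumptions, not an OpenMPI feature. It assumes Linux, that mpirun has not already bound the process to a narrower set, that mpirun exports OMPI_COMM_WORLD_LOCAL_RANK, and that consecutive CPU ids are a sensible grouping, which may not match the real socket or hyperthread topology. The function names split_cpus and bind_self are hypothetical.

```python
import os


def split_cpus(cpus, local_rank, cpus_per_task):
    """Pick the slice of the node-wide cpuset that belongs to one local rank.

    Assumption: ranks take consecutive chunks of the sorted CPU ids.
    This ignores hardware topology.
    """
    ordered = sorted(cpus)
    start = local_rank * cpus_per_task
    chunk = ordered[start:start + cpus_per_task]
    if len(chunk) < cpus_per_task:
        raise ValueError("not enough CPUs in the cpuset for this local rank")
    return chunk


def bind_self():
    """Restrict the calling process to its own share of the cores.

    Reads the node-local rank from OMPI_COMM_WORLD_LOCAL_RANK and the
    per-task core count from SLURM_CPUS_PER_TASK. Should run before any
    worker threads are created so that they inherit the mask.
    """
    local_rank = int(os.environ["OMPI_COMM_WORLD_LOCAL_RANK"])
    cpus_per_task = int(os.environ["SLURM_CPUS_PER_TASK"])
    mine = split_cpus(os.sched_getaffinity(0), local_rank, cpus_per_task)
    os.sched_setaffinity(0, mine)
    return mine
```

With the salloc example above (8 cores, 2 tasks on one node), local rank 0 would keep the first four CPU ids of the shared cpuset and local rank 1 the last four. A C code could take the same approach with sched_getaffinity and sched_setaffinity before entering its first OpenMP parallel region.
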
Thanks a lot!

Marcin



_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2015/10/27803.php
Link to this post: http://www.open-mpi.org/community/lists/users/2015/10/27812.php
