Ralph,

my guess is that the cpuset is set by the batch manager (slurm?),
so I think this is an ompi bug/missing feature:
"we" should check the cores available in the cpuset (4 in this case)
instead of the online cores (8 in this case).
I wrote "we" because the fix could live in either ompi or hwloc, or ompi
should simply ask hwloc for the correct info if hwloc already exposes it.
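
To illustrate what I mean, here is a quick sketch against the plain hwloc C
API (hwloc 1.x names; this is not the actual ompi code path, just a way to
show the two numbers). By default hwloc only reports the objects allowed by
the current cpuset, and the WHOLE_SYSTEM flag is only used here to count the
online cores for comparison:

/* sketch.c -- plain hwloc, not ompi; build with: gcc sketch.c -lhwloc */
#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;
    int allowed, online;

    /* Default load: hwloc only keeps objects inside the current cpuset,
     * so this counts the available cores (4 in Patrick's case). */
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);
    allowed = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
    hwloc_topology_destroy(topo);

    /* WHOLE_SYSTEM keeps the disallowed objects too, i.e. every online
     * core (8 in Patrick's case). */
    hwloc_topology_init(&topo);
    hwloc_topology_set_flags(topo, HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM);
    hwloc_topology_load(topo);
    online = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
    hwloc_topology_destroy(topo);

    printf("cores allowed by cpuset: %d, cores online: %d\n", allowed, online);
    return 0;
}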

Does that make sense?

Brice, can you please comment on hwloc and cpuset ?

Cheers,

Gilles


On Wednesday, September 16, 2015, Ralph Castain <r...@open-mpi.org> wrote:

> Not precisely correct. It depends on the environment.
>
> If there is a resource manager allocating nodes, or you provide a hostfile
> that specifies the number of slots on the nodes, or you use -host, then we
> default to no-oversubscribe.
>
> If you provide a hostfile that doesn’t specify slots, then we use the
> number of cores we find on each node, and we allow oversubscription.
>
> What is being described sounds like more of a bug than an intended
> feature. I’d need to know more about it, though, to be sure. Can you tell
> me how you are specifying this cpuset?
>
>
> On Sep 15, 2015, at 4:44 PM, Matt Thompson <fort...@gmail.com> wrote:
>
> Looking at the Open MPI 1.10.0 man page:
>
>   https://www.open-mpi.org/doc/v1.10/man1/mpirun.1.php
>
> it looks like perhaps -oversubscribe (which was an option) is now the
> default behavior. Instead we have:
>
> *-nooversubscribe, --nooversubscribe*
> Do not oversubscribe any nodes; error (without starting any processes) if
> the requested number of processes would cause oversubscription. This option
> implicitly sets "max_slots" equal to the "slots" value for each node.
>
> It also looks like -map-by has a way to implement it as well (see man
> page).
>
> Thanks for letting me/us know about this. On a system of mine I sort of
> depend on the -nooversubscribe behavior!
>
> Matt
>
>
>
> On Tue, Sep 15, 2015 at 11:17 AM, Patrick Begou <
> patrick.be...@legi.grenoble-inp.fr> wrote:
>
>> Hi,
>>
>> I'm running OpenMPI 1.10.0 built with Intel 2015 compilers on a Bullx
>> system.
>> I'm having some trouble with the bind-to core option when using a cpuset.
>> If the cpuset contains fewer than all the cores of a CPU (e.g. 4 cores
>> allowed on an 8-core CPU), OpenMPI 1.10.0 allows these cores to be
>> overloaded, up to the total number of cores on the CPU.
>> With this configuration, and because the cpuset only allows 4 cores, I
>> can reach 2 processes per core if I use:
>>
>> mpirun -np 8 --bind-to core my_application
>>
>> OpenMPI 1.7.3 doesn't show this problem in the same situation:
>> mpirun -np 8 --bind-to-core my_application
>> returns:
>> A request was made to bind to that would result in binding more
>> processes than cpus on a resource
>> and that's the expected behaviour, of course.
>>
>>
>> Is there a way to avoid this overloading with OpenMPI 1.10.0?
>>
>> Thanks
>>
>> Patrick
>>
>> --
>> ===================================================================
>> |  Equipe M.O.S.T.         |                                      |
>> |  Patrick BEGOU           | mailto:patrick.be...@grenoble-inp.fr |
>> |  LEGI                    |                                      |
>> |  BP 53 X                 | Tel 04 76 82 51 35                   |
>> |  38041 GRENOBLE CEDEX    | Fax 04 76 82 52 71                   |
>> ===================================================================
>>
>>
>
>
>
> --
> Matt Thompson
>
> Man Among Men
> Fulcrum of History
>
>
