Hi Gilles,

Wow, thanks - that was quick. I'm rebuilding now.

Cheers,
Ben


-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles
Gouaillardet
Sent: Friday, 29 January 2016 1:54 PM
To: Open MPI Users <us...@open-mpi.org>
Subject: Re: [OMPI users] Any changes to rmaps in 1.10.2?

Ben,

Here is a patch that fixes it.

Sorry for the inconvenience, and thanks for your help in understanding this issue.

Cheers,

Gilles

diff --git a/opal/mca/hwloc/base/hwloc_base_util.c b/opal/mca/hwloc/base/hwloc_base_util.c
index 237c6b0..a4fa193 100644
--- a/opal/mca/hwloc/base/hwloc_base_util.c
+++ b/opal/mca/hwloc/base/hwloc_base_util.c
@@ -492,8 +492,11 @@ static void df_search_cores(hwloc_obj_t obj, unsigned int *cnt)
             obj->userdata = (void*)data;
         }
         if (NULL == opal_hwloc_base_cpu_set) {
-            if (!hwloc_bitmap_isincluded(obj->cpuset, obj->allowed_cpuset)) {
-                /* do not count not allowed cores */
+            if (!hwloc_bitmap_intersects(obj->cpuset, obj->allowed_cpuset)) {
+                /*
+                 * do not count not allowed cores (e.g. cores with zero allowed PU)
+                 * if SMT is enabled, do count cores with at least one allowed hwthread
+                 */
                 return;
             }
             data->npus = 1;
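
To illustrate why this one-line change matters: with a cgroup that allows only one hardware thread per core, a core's full cpuset is no longer *included* in its allowed_cpuset, but the two sets still *intersect*, so the core should still be counted. Here is a minimal standalone sketch of the two bitmap tests (the PU numbers follow the node layout described further down in this thread; this is not the actual Open MPI code):

#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    /* core 0 owns PUs 0 and 16 (its two hardware threads) */
    hwloc_bitmap_t core_cpuset = hwloc_bitmap_alloc();
    hwloc_bitmap_set(core_cpuset, 0);
    hwloc_bitmap_set(core_cpuset, 16);

    /* the cgroup only allows PU 0 of that core */
    hwloc_bitmap_t allowed = hwloc_bitmap_alloc();
    hwloc_bitmap_set(allowed, 0);

    /* prints 0: the core is not fully contained in the allowed set */
    printf("isincluded: %d\n", hwloc_bitmap_isincluded(core_cpuset, allowed));
    /* prints 1: at least one of its hardware threads is allowed */
    printf("intersects: %d\n", hwloc_bitmap_intersects(core_cpuset, allowed));

    hwloc_bitmap_free(core_cpuset);
    hwloc_bitmap_free(allowed);
    return 0;
}

With the old test the core was skipped, which is why mpirun ended up reporting "number of cpus: 0"; with the new test a core is counted as long as at least one of its hwthreads is allowed.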




On 1/29/2016 11:43 AM, Ben Menadue wrote:
> Yes, I'm able to reproduce it on a single node as well.
>
> Actually, even on just a single CPU (and -np 1) it won't let me launch
> unless both threads of that core are in the cgroup.
>
>
> -----Original Message-----
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles
> Gouaillardet
> Sent: Friday, 29 January 2016 1:33 PM
> To: Open MPI Users <us...@open-mpi.org>
> Subject: Re: [OMPI users] Any changes to rmaps in 1.10.2?
>
> I was able to reproduce the issue on one node with a cpuset manually set.
>
> FWIW, I cannot reproduce the issue using taskset instead of a cpuset (!)
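>
> For reference, this is roughly what I mean by the two ways of restricting the
> launcher to CPUs 0-15 (the cpuset path and the "test" name are just an example
> following the cgroup layout shown later in this thread; the cgroup part needs root):
>
>     # cgroup cpuset: this is the case that reproduces the failure
>     mkdir /cgroup/cpuset/test
>     echo 0-15 > /cgroup/cpuset/test/cpuset.cpus
>     echo 0-1  > /cgroup/cpuset/test/cpuset.mems   # NUMA nodes; adjust to the machine
>     echo $$   > /cgroup/cpuset/test/tasks
>     mpirun hostname
>
>     # plain CPU affinity: this does not reproduce it
>     taskset -c 0-15 mpirun hostname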
>
> Cheers,
>
> Gilles
>
> On 1/29/2016 11:08 AM, Ben Menadue wrote:
>> Hi Gilles, Ralph,
>>
>> Okay, it definitely seems to be due to the cpuset having only one of
>> the hyperthreads of each physical core:
>>
>>
>> [13:02:13 root@r60:4363542.r-man2] # echo 0-15 > cpuset.cpus
>>
>> 13:03 bjm900@r60 ~ > cat
>> /cgroup/cpuset/pbspro/4363542.r-man2/cpuset.cpus
>> 0-15
>>
>> 13:03 bjm900@r60 ~ > /apps/openmpi/1.10.2/bin/mpirun  hostname
>> --------------------------------------------------------------------------
>> A request for multiple cpus-per-proc was given, but a directive
>> was also give to map to an object level that has less cpus than
>> requested ones:
>>
>>     #cpus-per-proc:  1
>>     number of cpus:  0
>>     map-by:          BYCORE:NOOVERSUBSCRIBE
>>
>> Please specify a mapping level that has more cpus, or else let us
>> define a default mapping that will allow multiple cpus-per-proc.
>> --------------------------------------------------------------------------
>>
>> [13:03:43 root@r60:4363542.r-man2] # echo 0-31 > cpuset.cpus
>>
>> 13:03 bjm900@r60 ~ > cat
>> /cgroup/cpuset/pbspro/4363542.r-man2/cpuset.cpus
>> 0-31
>>
>> 13:04 bjm900@r60 ~ > /apps/openmpi/1.10.2/bin/mpirun  hostname
>> <...hostnames...>
>>
>>
>> Cheers,
>> Ben
>>
>>
>>
>> -----Original Message-----
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ben
>> Menadue
>> Sent: Friday, 29 January 2016 1:01 PM
>> To: 'Open MPI Users' <us...@open-mpi.org>
>> Subject: Re: [OMPI users] Any changes to rmaps in 1.10.2?
>>
>> Hi Gilles,
>>
>>> with respect to PBS, are both OpenMPI built the same way ?
>>> e.g. configure --with-tm=/opt/pbs/default or something similar
>> Both are built against TM explicitly using the --with-tm option.
>>
>>> you can run
>>> mpirun --mca plm_base_verbose 100 --mca ess_base_verbose 100 --mca
>> ras_base_verbose 100 hostname
>>> and you should see the "tm" module in the logs.
>> Yes, it appears to use TM from what I can see. Outputs from 1.10.0 and
>> 1.10.2 are attached from inside the same job - they look identical
>> (apart from the pids), except at the very end, where 1.10.2 errors out
>> while 1.10.0 continues.
>>
>>> I noticed you run
>>> mpirun -np 2 ...
>>> Is there any reason why you explicitly request 2 tasks?
>> The "-np 2" is because that's what I was using to benchmark the
>> install
>> (osu_bibw) and I just copied it over from when I realised it wasn't
>> even starting. But it does the same regardless of whether I specify
>> the number of processes or not (without it gets the number of tasks from
> PBS).
>>> By any chance, is hyperthreading enabled on your compute node?
>>> /* if yes, that means all cores are in the cpuset, but with only one
>>> thread per core */
>>
>> The nodes are 2 x 8-core sockets with hyper-threading on, and you can
>> choose whether to use the extra hardware threads when submitting the
>> job. If you want them, your cgroup includes both threads of each core
>> (e.g. 0-31); otherwise it includes only one thread per core (e.g. 0-15)
>> (CPUs 16-31 are the thread siblings of CPUs 0-15).
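>>
>> A quick way to double-check that sibling layout from inside the job
>> (standard Linux sysfs and /proc paths; the outputs shown are just what I'd
>> expect on these nodes given the layout above):
>>
>>     cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
>>     0,16
>>     grep Cpus_allowed_list /proc/self/status
>>     Cpus_allowed_list:      0-15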
>>
>> For reference, the PBS job I was using above had ncpus=32,mem=16G,
>> which becomes
>>
>>     select=2:ncpus=16:mpiprocs=16:mem=8589934592b
>>
>> under the hood, with a cpuset containing CPUs 0-15 on each of the two
>> nodes.
>> Interestingly, if I use a cpuset containing both threads of each
>> physical core (i.e. ask for hyperthreading on job submission), then it
>> runs fine under 1.10.2.
>>
>> Cheers,
>> Ben
>>
>>
>> -----Original Message-----
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles
>> Gouaillardet
>> Sent: Friday, 29 January 2016 11:07 AM
>> To: Open MPI Users <us...@open-mpi.org>
>> Subject: Re: [OMPI users] Any changes to rmaps in 1.10.2?
>>
>> Ben,
>>
>>
>>
>> That is not needed if you submit with qsub -l nodes=1:ppn=2. Do you
>> observe the same behavior without -np 2?
>>
>>
>> Cheers,
>>
>> Gilles
>>
>> On 1/28/2016 7:57 AM, Ben Menadue wrote:
>>> Hi,
>>>
>>> Were there any changes to rmaps going to 1.10.2? An
>>> otherwise-identical setup that worked in 1.10.0 fails to launch in
>>> 1.10.2, complaining that there are no CPUs available in a socket...
>>>
>>> With 1.10.0:
>>>
>>> $ /apps/openmpi/1.10.0/bin/mpirun -np 2 -mca rmaps_base_verbose 1000
>>> hostname [r47:18709] mca: base: components_register: registering
>>> rmaps components [r47:18709] mca: base: components_register: found
>>> loaded component resilient [r47:18709] mca: base: components_register:
>>> component resilient register function successful [r47:18709] mca:
>>> base: components_register: found loaded component rank_file
>>> [r47:18709] mca: base: components_register: component rank_file
>>> register function successful [r47:18709] mca: base:
>>> components_register: found loaded component staged [r47:18709] mca:
>>> base: components_register: component staged has no register or open
>>> function [r47:18709] mca: base: components_register: found loaded
>>> component ppr [r47:18709] mca: base: components_register: component
>>> ppr register function successful [r47:18709] mca: base:
>>> components_register: found loaded component seq [r47:18709] mca: base:
>>> components_register: component seq register function successful
>>> [r47:18709] mca: base: components_register: found loaded component
>>> round_robin [r47:18709] mca: base: components_register: component
>>> round_robin register function successful [r47:18709] mca: base:
>>> components_register: found loaded component mindist [r47:18709] mca:
>>> base: components_register: component mindist register function
>>> successful [r47:18709] [[63529,0],0] rmaps:base set policy with core
>>> [r47:18709] mca: base: components_open: opening rmaps components
>>> [r47:18709] mca: base: components_open: found loaded component
>>> resilient [r47:18709] mca: base: components_open: component resilient
>>> open function successful [r47:18709] mca: base: components_open:
>>> found loaded component rank_file [r47:18709] mca: base: components_open:
>>> component rank_file open function successful [r47:18709] mca: base:
>>> components_open: found loaded component staged [r47:18709] mca: base:
>>> components_open: component staged open function successful
>>> [r47:18709]
>>> mca: base: components_open: found loaded component ppr [r47:18709]
>>> mca: base: components_open: component ppr open function successful
>>> [r47:18709] mca: base: components_open: found loaded component seq
>>> [r47:18709] mca: base: components_open: component seq open function
>>> successful [r47:18709] mca: base: components_open: found loaded
>>> component round_robin [r47:18709] mca: base: components_open:
>>> component round_robin open function successful [r47:18709] mca: base:
>>> components_open: found loaded component mindist [r47:18709] mca: base:
>>> components_open: component mindist open function successful
>>> [r47:18709] mca:rmaps:select: checking available component resilient
>>> [r47:18709] mca:rmaps:select: Querying component [resilient]
>>> [r47:18709] mca:rmaps:select: checking available component rank_file
>>> [r47:18709] mca:rmaps:select: Querying component [rank_file]
>>> [r47:18709] mca:rmaps:select: checking available component staged
>>> [r47:18709] mca:rmaps:select: Querying component [staged] [r47:18709]
>>> mca:rmaps:select: checking available component ppr [r47:18709]
>>> mca:rmaps:select: Querying component [ppr] [r47:18709]
>>> mca:rmaps:select: checking available component seq [r47:18709]
>>> mca:rmaps:select: Querying component [seq] [r47:18709]
>>> mca:rmaps:select: checking available component round_robin
>>> [r47:18709]
>>> mca:rmaps:select: Querying component [round_robin] [r47:18709]
>>> mca:rmaps:select: checking available component mindist [r47:18709]
>>> mca:rmaps:select: Querying component [mindist] [r47:18709]
>>> [[63529,0],0]: Final mapper priorities
>>> [r47:18709]         Mapper: ppr Priority: 90
>>> [r47:18709]         Mapper: seq Priority: 60
>>> [r47:18709]         Mapper: resilient Priority: 40
>>> [r47:18709]         Mapper: mindist Priority: 20
>>> [r47:18709]         Mapper: round_robin Priority: 10
>>> [r47:18709]         Mapper: staged Priority: 5
>>> [r47:18709]         Mapper: rank_file Priority: 0
>>> [r47:18709] mca:rmaps: mapping job [63529,1] [r47:18709] mca:rmaps:
>>> creating new map for job [63529,1] [r47:18709] mca:rmaps: nprocs 2
>>> [r47:18709] mca:rmaps mapping given - using default [r47:18709]
>>> mca:rmaps:ppr: job [63529,1] not using ppr mapper [r47:18709]
>>> mca:rmaps:seq: job [63529,1] not using seq mapper [r47:18709]
>>> mca:rmaps:resilient: cannot perform initial map of job [63529,1]
>>> - no fault groups
>>> [r47:18709] mca:rmaps:mindist: job [63529,1] not using mindist mapper
>>> [r47:18709] mca:rmaps:rr: mapping job [63529,1] [r47:18709] AVAILABLE
>>> NODES FOR MAPPING:
>>> [r47:18709]     node: r47 daemon: 0
>>> [r47:18709]     node: r57 daemon: 1
>>> [r47:18709]     node: r58 daemon: 2
>>> [r47:18709]     node: r59 daemon: 3
>>> [r47:18709] mca:rmaps:rr: mapping no-span by Core for job [63529,1]
>>> slots 64 num_procs 2 [r47:18709] mca:rmaps:rr: found 16 Core objects
>>> on node r47 [r47:18709] mca:rmaps:rr: assigning proc to object 0
>>> [r47:18709] mca:rmaps:rr: assigning proc to object 1 [r47:18709]
>>> mca:rmaps: computing ranks by core for job [63529,1] [r47:18709]
>>> mca:rmaps:rank_by: found 16 objects on node r47 with 2 procs
>>> [r47:18709] mca:rmaps:rank_by: assigned rank 0 [r47:18709]
>>> mca:rmaps:rank_by: assigned rank 1 [r47:18709] mca:rmaps:rank_by:
>>> found 16 objects on node r57 with 0 procs [r47:18709]
>>> mca:rmaps:rank_by: found 16 objects on node r58 with 0 procs
>>> [r47:18709] mca:rmaps:rank_by: found 16 objects on node r59 with 0
>>> procs [r47:18709] mca:rmaps: compute bindings for job [63529,1] with
>>> policy CORE[4008] [r47:18709] mca:rmaps: bindings for job [63529,1] -
>>> bind in place [r47:18709] mca:rmaps: bind in place for job [63529,1]
>>> with bindings CORE [r47:18709] [[63529,0],0] reset_usage: node r47
>>> has
>>> 2 procs on it [r47:18709] [[63529,0],0] reset_usage: ignoring proc
>>> [[63529,1],0] [r47:18709] [[63529,0],0] reset_usage: ignoring proc
>>> [[63529,1],1] [r47:18709] BINDING PROC [[63529,1],0] TO Core NUMBER 0
>>> [r47:18709] [[63529,0],0] BOUND PROC [[63529,1],0] TO 0[Core:0] on
>>> node r47 [r47:18709] BINDING PROC [[63529,1],1] TO Core NUMBER 1
>>> [r47:18709] [[63529,0],0] BOUND PROC [[63529,1],1] TO 1[Core:1] on
>>> node r47
>>> r47
>>> r47
>>> [r47:18709] mca: base: close: component resilient closed [r47:18709]
>>> mca: base: close: unloading component resilient [r47:18709] mca: base:
>>> close: component rank_file closed [r47:18709] mca: base: close:
>>> unloading component rank_file [r47:18709] mca: base: close: component
>>> staged closed [r47:18709] mca: base: close: unloading component
>>> staged [r47:18709] mca: base: close: component ppr closed [r47:18709]
> mca:
>>> base: close: unloading component ppr [r47:18709] mca: base: close:
>>> component seq closed [r47:18709] mca: base: close: unloading
>>> component seq [r47:18709] mca: base: close: component round_robin
>>> closed [r47:18709] mca: base: close: unloading component round_robin
>>> [r47:18709] mca: base: close: component mindist closed [r47:18709]
>>> mca: base: close: unloading component mindist
>>>
>>> With 1.10.2:
>>>
>>> $ /apps/openmpi/1.10.2/bin/mpirun -np 2 -mca rmaps_base_verbose 1000
>>> hostname [r47:18733] mca: base: components_register: registering
>>> rmaps components [r47:18733] mca: base: components_register: found
>>> loaded component resilient [r47:18733] mca: base: components_register:
>>> component resilient register function successful [r47:18733] mca:
>>> base: components_register: found loaded component rank_file
>>> [r47:18733] mca: base: components_register: component rank_file
>>> register function successful [r47:18733] mca: base:
>>> components_register: found loaded component staged [r47:18733] mca:
>>> base: components_register: component staged has no register or open
>>> function [r47:18733] mca: base: components_register: found loaded
>>> component ppr [r47:18733] mca: base: components_register: component
>>> ppr register function successful [r47:18733] mca: base:
>>> components_register: found loaded component seq [r47:18733] mca: base:
>>> components_register: component seq register function successful
>>> [r47:18733] mca: base: components_register: found loaded component
>>> round_robin [r47:18733] mca: base: components_register: component
>>> round_robin register function successful [r47:18733] mca: base:
>>> components_register: found loaded component mindist [r47:18733] mca:
>>> base: components_register: component mindist register function
>>> successful [r47:18733] [[63505,0],0] rmaps:base set policy with core
>>> [r47:18733] mca: base: components_open: opening rmaps components
>>> [r47:18733] mca: base: components_open: found loaded component
>>> resilient [r47:18733] mca: base: components_open: component resilient
>>> open function successful [r47:18733] mca: base: components_open:
>>> found loaded component rank_file [r47:18733] mca: base: components_open:
>>> component rank_file open function successful [r47:18733] mca: base:
>>> components_open: found loaded component staged [r47:18733] mca: base:
>>> components_open: component staged open function successful
>>> [r47:18733]
>>> mca: base: components_open: found loaded component ppr [r47:18733]
>>> mca: base: components_open: component ppr open function successful
>>> [r47:18733] mca: base: components_open: found loaded component seq
>>> [r47:18733] mca: base: components_open: component seq open function
>>> successful [r47:18733] mca: base: components_open: found loaded
>>> component round_robin [r47:18733] mca: base: components_open:
>>> component round_robin open function successful [r47:18733] mca: base:
>>> components_open: found loaded component mindist [r47:18733] mca: base:
>>> components_open: component mindist open function successful
>>> [r47:18733] mca:rmaps:select: checking available component resilient
>>> [r47:18733] mca:rmaps:select: Querying component [resilient]
>>> [r47:18733] mca:rmaps:select: checking available component rank_file
>>> [r47:18733] mca:rmaps:select: Querying component [rank_file]
>>> [r47:18733] mca:rmaps:select: checking available component staged
>>> [r47:18733] mca:rmaps:select: Querying component [staged] [r47:18733]
>>> mca:rmaps:select: checking available component ppr [r47:18733]
>>> mca:rmaps:select: Querying component [ppr] [r47:18733]
>>> mca:rmaps:select: checking available component seq [r47:18733]
>>> mca:rmaps:select: Querying component [seq] [r47:18733]
>>> mca:rmaps:select: checking available component round_robin
>>> [r47:18733]
>>> mca:rmaps:select: Querying component [round_robin] [r47:18733]
>>> mca:rmaps:select: checking available component mindist [r47:18733]
>>> mca:rmaps:select: Querying component [mindist] [r47:18733]
>>> [[63505,0],0]: Final mapper priorities
>>> [r47:18733]         Mapper: ppr Priority: 90
>>> [r47:18733]         Mapper: seq Priority: 60
>>> [r47:18733]         Mapper: resilient Priority: 40
>>> [r47:18733]         Mapper: mindist Priority: 20
>>> [r47:18733]         Mapper: round_robin Priority: 10
>>> [r47:18733]         Mapper: staged Priority: 5
>>> [r47:18733]         Mapper: rank_file Priority: 0
>>> [r47:18733] mca:rmaps: mapping job [63505,1] [r47:18733] mca:rmaps:
>>> creating new map for job [63505,1] [r47:18733] mca:rmaps: nprocs 2
>>> [r47:18733] mca:rmaps mapping given - using default [r47:18733]
>>> mca:rmaps:ppr: job [63505,1] not using ppr mapper [r47:18733]
>>> mca:rmaps:seq: job [63505,1] not using seq mapper [r47:18733]
>>> mca:rmaps:resilient: cannot perform initial map of job [63505,1]
>>> - no fault groups
>>> [r47:18733] mca:rmaps:mindist: job [63505,1] not using mindist mapper
>>> [r47:18733] mca:rmaps:rr: mapping job [63505,1] [r47:18733] AVAILABLE
>>> NODES FOR MAPPING:
>>> [r47:18733]     node: r47 daemon: 0
>>> [r47:18733]     node: r57 daemon: 1
>>> [r47:18733]     node: r58 daemon: 2
>>> [r47:18733]     node: r59 daemon: 3
>>> [r47:18733] mca:rmaps:rr: mapping no-span by Core for job [63505,1]
>>> slots 64 num_procs 2 [r47:18733] mca:rmaps:rr: found 16 Core objects
>>> on node r47 [r47:18733] mca:rmaps:rr: assigning proc to object 0
>>> --------------------------------------------------------------------------
>>> A request for multiple cpus-per-proc was given, but a directive
>>> was also give to map to an object level that has less cpus than
>>> requested ones:
>>>
>>>      #cpus-per-proc:  1
>>>      number of cpus:  0
>>>      map-by:          BYCORE:NOOVERSUBSCRIBE
>>>
>>> Please specify a mapping level that has more cpus, or else let us
>>> define a default mapping that will allow multiple cpus-per-proc.
>>> --------------------------------------------------------------------------
>>> [r47:18733] mca: base: close: component resilient closed
>>> [r47:18733] mca: base: close: unloading component resilient
>>> [r47:18733] mca: base: close: component rank_file closed [r47:18733]
>>> mca: base: close: unloading component rank_file [r47:18733] mca: base:
>>> close: component staged closed [r47:18733] mca: base: close:
>>> unloading component staged [r47:18733] mca: base: close: component
>>> ppr closed [r47:18733] mca: base: close: unloading component ppr
> [r47:18733] mca:
>>> base: close: component seq closed [r47:18733] mca: base: close:
>>> unloading component seq [r47:18733] mca: base: close: component
>>> round_robin closed [r47:18733] mca: base: close: unloading component
>>> round_robin [r47:18733] mca: base: close: component mindist closed
>>> [r47:18733] mca: base: close: unloading component mindist
>>>
>>> These are both from the same PBS Pro job, and the cpuset definitely has
>>> all cores available:
>>>
>>> $ cat /cgroup/cpuset/pbspro/4347646.r-man2/cpuset.cpus
>>> 0-15
>>>
>>> Is there something here I'm missing?
>>>
>>> Cheers,
>>> Ben
>>>
>>>
_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2016/01/28406.php
