Hi Ralph, I have tested your fix - 30895. I'm afraid to say I found a mistake.
You should include "SETTING BIND_TO_NONE" in the above if-clause at lines
74, 256, 511 and 656. Otherwise, only the warning message disappears, but the
binding to core is still overwritten by the binding to none. Please see the
attached patch. (See attached file: patch_from_30895)

Tetsuya

> Hi Ralph, I understood what you meant.
>
> I often use float for our application.
> float c = (float)(unsigned int a - unsigned int b) could
> be a very huge number, if a < b. So I always carefully cast to
> int from unsigned int when I subtract them. I didn't know/mind that
> int d = (unsigned int a - unsigned int b) has no problem.
> I noticed it from your suggestion, thanks.
>
> Therefore, I think my fix is not necessary.
>
> Tetsuya
>
> > Yes, indeed. In the future, when we have many, many cores
> > in the machine, we will have to take care of overrun of
> > num_procs.
> >
> > Tetsuya
> >
> > > Cool - easily modified. Thanks!
> > >
> > > Of course, you understand (I'm sure) that the cast does nothing to
> > > protect the code from blowing up if we overrun the var. In other words,
> > > if the unsigned var has wrapped, then casting it to int
> > > won't help - you'll still get a negative integer, and the code will
> > > trash.
> > >
> > >
> > > On Feb 28, 2014, at 3:43 PM, tmish...@jcity.maeda.co.jp wrote:
> > >
> > > > Hi Ralph, I'm a little bit late for your release.
> > > >
> > > > I found a minor mistake in byobj_span - an integer casting problem.
> > > >
> > > > --- rmaps_rr_mappers.30892.c 2014-03-01 08:31:50 +0900
> > > > +++ rmaps_rr_mappers.c 2014-03-01 08:33:22 +0900
> > > > @@ -689,7 +689,7 @@
> > > >      }
> > > >
> > > >      /* compute how many objs need an extra proc */
> > > > -    if (0 > (nxtra_objs = app->num_procs - (navg * nobjs))) {
> > > > +    if (0 > (nxtra_objs = (int)app->num_procs - (navg * (int)nobjs))) {
> > > >          nxtra_objs = 0;
> > > >      }
> > > >
> > > > Tetsuya
> > > >
> > > >> Please take a look at https://svn.open-mpi.org/trac/ompi/ticket/4317
> > > >>
> > > >>
> > > >> On Feb 27, 2014, at 8:13 PM, tmish...@jcity.maeda.co.jp wrote:
> > > >>
> > > >>>
> > > >>>
> > > >>> Hi Ralph, I can't operate our cluster for a few days, sorry.
> > > >>>
> > > >>> But now, I'm narrowing down the cause by browsing the source code.
> > > >>>
> > > >>> My best guess is line 529. opal_hwloc_base_get_obj_by_type will
> > > >>> reset the object pointer to the first one when you move on to the
> > > >>> next node.
> > > >>>
> > > >>> 529    if (NULL == (obj = opal_hwloc_base_get_obj_by_type
> > > >>>            (node->topology, target, cache_level, i, OPAL_HWLOC_AVAILABLE))) {
> > > >>> 530        ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
> > > >>> 531        return ORTE_ERR_NOT_FOUND;
> > > >>> 532    }
> > > >>>
> > > >>> If node->slots=1, then nprocs is set to 1 in the second pass:
> > > >>>
> > > >>> 495    nprocs = (node->slots - node->slots_inuse) /
> > > >>>            orte_rmaps_base.cpus_per_rank;
> > > >>> 496    if (nprocs < 1) {
> > > >>> 497        if (second_pass) {
> > > >>> 498            /* already checked for oversubscription permission, so at least put
> > > >>> 499             * one proc on it
> > > >>> 500             */
> > > >>> 501            nprocs = 1;
> > > >>>
> > > >>> Therefore, opal_hwloc_base_get_obj_by_type is called one by one at
> > > >>> each node, which means the object we get is always the first one.
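For reference, here is a minimal C illustration of the unsigned-subtraction
behaviour discussed near the top of the quoted message above. The variable
names are illustrative only; this is not code from the Open MPI tree.

    #include <stdio.h>

    int main(void)
    {
        unsigned int a = 1, b = 2;          /* a < b, as in the example above */

        /* a - b is computed in unsigned arithmetic and wraps around to
         * UINT_MAX, so converting the result to float keeps that huge value. */
        float c = (float)(a - b);

        /* Converting the same wrapped value to int typically gives -1 on
         * two's-complement platforms, which is why storing the difference
         * in an int is harmless as long as the true difference fits in an int. */
        int d = a - b;

        printf("c = %f, d = %d\n", c, d);   /* e.g. c = 4294967296.000000, d = -1 */
        return 0;
    }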
> > > >>> > > > >>> It's not elegant but I guess you need dummy calls of > > > >>> opal_hwloc_base_get_obj_by_type to > > > >>> move the object pointer to the right place or modify > > > >>> opal_hwloc_base_get_obj_by_type itself. > > > >>> > > > >>> Tetsuya > > > >>> > > > >>>> I'm having trouble seeing why it is failing, so I added some more > > > > debug > > > >>> output. Could you run the failure case again with -mca > > > > rmaps_base_verbose > > > >>> 10? > > > >>>> > > > >>>> Thanks > > > >>>> Ralph > > > >>>> > > > >>>> On Feb 27, 2014, at 6:11 PM, tmish...@jcity.maeda.co.jp wrote: > > > >>>> > > > >>>>> > > > >>>>> > > > >>>>> Just checking the difference, not so significant meaning... > > > >>>>> > > > >>>>> Anyway, I guess it's due to the behavior when slot counts is > > missing > > > >>>>> (regarded as slots=1) and it's oversubscribed unintentionally. > > > >>>>> > > > >>>>> I'm going out now, so I can't verify it quickly. If I provide the > > > >>>>> correct slot counts, it wll work, I guess. How do you think? > > > >>>>> > > > >>>>> Tetsuya > > > >>>>> > > > >>>>>> "restore" in what sense? > > > >>>>>> > > > >>>>>> On Feb 27, 2014, at 4:10 PM, tmish...@jcity.maeda.co.jp wrote: > > > >>>>>> > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> Hi Ralph, this is just for your information. > > > >>>>>>> > > > >>>>>>> I tried to restore previous orte_rmaps_rr_byobj. Then I gets > the > > > >>> result > > > >>>>>>> below with this command line: > > > >>>>>>> > > > >>>>>>> mpirun -np 8 -host node05,node06 -report-bindings -map-by > > > > socket:pe=2 > > > >>>>>>> -display-map -bind-to core:overload-allowed > > > >>> ~/mis/openmpi/demos/myprog > > > >>>>>>> Data for JOB [31184,1] offset 0 > > > >>>>>>> > > > >>>>>>> ======================== JOB MAP ======================== > > > >>>>>>> > > > >>>>>>> Data for node: node05 Num slots: 1 Max slots: 0 Num > procs: > > 7 > > > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 0 > > > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 2 > > > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 4 > > > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 6 > > > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 1 > > > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 3 > > > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 5 > > > >>>>>>> > > > >>>>>>> Data for node: node06 Num slots: 1 Max slots: 0 Num > procs: > > 1 > > > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 7 > > > >>>>>>> > > > >>>>>>> ============================================================= > > > >>>>>>> [node06.cluster:18857] MCW rank 7 bound to socket 0[core 0 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 0[core 1[hwt 0]]: [B/B/./.][./././.] > > > >>>>>>> [node05.cluster:21399] MCW rank 3 bound to socket 1[core 6 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 1[core 7[hwt 0]]: [./././.][././B/B] > > > >>>>>>> [node05.cluster:21399] MCW rank 4 bound to socket 0[core 0 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 0[core 1[hwt 0]]: [B/B/./.][./././.] > > > >>>>>>> [node05.cluster:21399] MCW rank 5 bound to socket 1[core 4 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 1[core 5[hwt 0]]: [./././.][B/B/./.] > > > >>>>>>> [node05.cluster:21399] MCW rank 6 bound to socket 0[core 2 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 0[core 3[hwt 0]]: [././B/B][./././.] 
> > > >>>>>>> [node05.cluster:21399] MCW rank 0 bound to socket 0[core 0 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 0[core 1[hwt 0]]: [B/B/./.][./././.] > > > >>>>>>> [node05.cluster:21399] MCW rank 1 bound to socket 1[core 4 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 1[core 5[hwt 0]]: [./././.][B/B/./.] > > > >>>>>>> [node05.cluster:21399] MCW rank 2 bound to socket 0[core 2 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 0[core 3[hwt 0]]: [././B/B][./././.] > > > >>>>>>> .... > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> Then I add "-hostfile pbs_hosts" and the result is: > > > >>>>>>> > > > >>>>>>> [mishima@manage work]$cat pbs_hosts > > > >>>>>>> node05 slots=8 > > > >>>>>>> node06 slots=8 > > > >>>>>>> [mishima@manage work]$ mpirun -np 8 -hostfile ~/work/pbs_hosts > > > >>>>>>> -report-bindings -map-by socket:pe=2 -display-map > > > >>>>>>> ~/mis/openmpi/demos/myprog > > > >>>>>>> Data for JOB [30254,1] offset 0 > > > >>>>>>> > > > >>>>>>> ======================== JOB MAP ======================== > > > >>>>>>> > > > >>>>>>> Data for node: node05 Num slots: 8 Max slots: 0 Num > procs: > > 4 > > > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 0 > > > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 2 > > > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 1 > > > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 3 > > > >>>>>>> > > > >>>>>>> Data for node: node06 Num slots: 8 Max slots: 0 Num > procs: > > 4 > > > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 4 > > > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 6 > > > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 5 > > > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 7 > > > >>>>>>> > > > >>>>>>> ============================================================= > > > >>>>>>> [node05.cluster:21501] MCW rank 2 bound to socket 0[core 2 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 0[core 3[hwt 0]]: [././B/B][./././.] > > > >>>>>>> [node05.cluster:21501] MCW rank 3 bound to socket 1[core 6 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 1[core 7[hwt 0]]: [./././.][././B/B] > > > >>>>>>> [node05.cluster:21501] MCW rank 0 bound to socket 0[core 0 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 0[core 1[hwt 0]]: [B/B/./.][./././.] > > > >>>>>>> [node05.cluster:21501] MCW rank 1 bound to socket 1[core 4 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 1[core 5[hwt 0]]: [./././.][B/B/./.] > > > >>>>>>> [node06.cluster:18935] MCW rank 6 bound to socket 0[core 2 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 0[core 3[hwt 0]]: [././B/B][./././.] > > > >>>>>>> [node06.cluster:18935] MCW rank 7 bound to socket 1[core 6 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 1[core 7[hwt 0]]: [./././.][././B/B] > > > >>>>>>> [node06.cluster:18935] MCW rank 4 bound to socket 0[core 0 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 0[core 1[hwt 0]]: [B/B/./.][./././.] > > > >>>>>>> [node06.cluster:18935] MCW rank 5 bound to socket 1[core 4 [hwt > > 0]], > > > >>>>> socket > > > >>>>>>> 1[core 5[hwt 0]]: [./././.][B/B/./.] > > > >>>>>>> .... > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> I think previous version's behavior would be close to what I > > > > expect. > > > >>>>>>> > > > >>>>>>> Tetusya > > > >>>>>>> > > > >>>>>>>> They have 4 cores/socket and 2 sockets, totally 4 X 2 = 8 > cores, > > > >>> each. > > > >>>>>>>> > > > >>>>>>>> Here is the output of lstopo. 
> > > >>>>>>>> > > > >>>>>>>> mishima@manage round_robin]$ rsh node05 > > > >>>>>>>> Last login: Tue Feb 18 15:10:15 from manage > > > >>>>>>>> [mishima@node05 ~]$ lstopo > > > >>>>>>>> Machine (32GB) > > > >>>>>>>> NUMANode L#0 (P#0 16GB) + Socket L#0 + L3 L#0 (6144KB) > > > >>>>>>>> L2 L#0 (512KB) + L1d L#0 (64KB) + L1i L#0 (64KB) + Core L#0 + > PU > > > > L#0 > > > >>>>>>>> (P#0) > > > >>>>>>>> L2 L#1 (512KB) + L1d L#1 (64KB) + L1i L#1 (64KB) + Core L#1 + > PU > > > > L#1 > > > >>>>>>>> (P#1) > > > >>>>>>>> L2 L#2 (512KB) + L1d L#2 (64KB) + L1i L#2 (64KB) + Core L#2 + > PU > > > > L#2 > > > >>>>>>>> (P#2) > > > >>>>>>>> L2 L#3 (512KB) + L1d L#3 (64KB) + L1i L#3 (64KB) + Core L#3 + > PU > > > > L#3 > > > >>>>>>>> (P#3) > > > >>>>>>>> NUMANode L#1 (P#1 16GB) + Socket L#1 + L3 L#1 (6144KB) > > > >>>>>>>> L2 L#4 (512KB) + L1d L#4 (64KB) + L1i L#4 (64KB) + Core L#4 + > PU > > > > L#4 > > > >>>>>>>> (P#4) > > > >>>>>>>> L2 L#5 (512KB) + L1d L#5 (64KB) + L1i L#5 (64KB) + Core L#5 + > PU > > > > L#5 > > > >>>>>>>> (P#5) > > > >>>>>>>> L2 L#6 (512KB) + L1d L#6 (64KB) + L1i L#6 (64KB) + Core L#6 + > PU > > > > L#6 > > > >>>>>>>> (P#6) > > > >>>>>>>> L2 L#7 (512KB) + L1d L#7 (64KB) + L1i L#7 (64KB) + Core L#7 + > PU > > > > L#7 > > > >>>>>>>> (P#7) > > > >>>>>>>> .... > > > >>>>>>>> > > > >>>>>>>> I foucused on byobj_span and bynode. I didn't notice byobj was > > > >>>>> modified, > > > >>>>>>>> sorry. > > > >>>>>>>> > > > >>>>>>>> Tetsuya > > > >>>>>>>> > > > >>>>>>>>> Hmmm..what does your node look like again (sockets and > cores)? > > > >>>>>>>>> > > > >>>>>>>>> On Feb 27, 2014, at 3:19 PM, tmish...@jcity.maeda.co.jp > wrote: > > > >>>>>>>>> > > > >>>>>>>>>> > > > >>>>>>>>>> Hi Ralph, I'm afraid to say your new "map-by obj" causes > > another > > > >>>>>>>> problem. > > > >>>>>>>>>> > > > >>>>>>>>>> I have overload message with this command line as shown > below: > > > >>>>>>>>>> > > > >>>>>>>>>> mpirun -np 8 -host node05,node06 -report-bindings -map-by > > > >>>>> socket:pe=2 > > > >>>>>>>>>> -display-map ~/mis/openmpi/d > > > >>>>>>>>>> emos/myprog > > > >>>>>>>>>> > > > >>>>>>>> > > > >>>>>>> > > > >>>>> > > > >>> > > > > > > > -------------------------------------------------------------------------- > > > >>>>>>>>>> A request was made to bind to that would result in binding > > more > > > >>>>>>>>>> processes than cpus on a resource: > > > >>>>>>>>>> > > > >>>>>>>>>> Bind to: CORE > > > >>>>>>>>>> Node: node05 > > > >>>>>>>>>> #processes: 2 > > > >>>>>>>>>> #cpus: 1 > > > >>>>>>>>>> > > > >>>>>>>>>> You can override this protection by adding the > > > > "overload-allowed" > > > >>>>>>>>>> option to your binding directive. > > > >>>>>>>>>> > > > >>>>>>>> > > > >>>>>>> > > > >>>>> > > > >>> > > > > > > > -------------------------------------------------------------------------- > > > >>>>>>>>>> > > > >>>>>>>>>> Then, I add "-bind-to core:overload-allowed" to see what > > > > happenes. 
> > > >>>>>>>>>> > > > >>>>>>>>>> mpirun -np 8 -host node05,node06 -report-bindings -map-by > > > >>>>> socket:pe=2 > > > >>>>>>>>>> -display-map -bind-to core:o > > > >>>>>>>>>> verload-allowed ~/mis/openmpi/demos/myprog > > > >>>>>>>>>> Data for JOB [14398,1] offset 0 > > > >>>>>>>>>> > > > >>>>>>>>>> ======================== JOB MAP > ======================== > > > >>>>>>>>>> > > > >>>>>>>>>> Data for node: node05 Num slots: 1 Max slots: 0 Num > > > > procs: > > > >>> 4 > > > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 0 > > > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 1 > > > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 2 > > > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 3 > > > >>>>>>>>>> > > > >>>>>>>>>> Data for node: node06 Num slots: 1 Max slots: 0 Num > > > > procs: > > > >>> 4 > > > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 4 > > > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 5 > > > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 6 > > > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 7 > > > >>>>>>>>>> > > > >>>>>>>>>> > ============================================================= > > > >>>>>>>>>> [node06.cluster:18443] MCW rank 6 bound to socket 0[core 0 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 0[core 1[hwt 0]]: [B/B/./.][./././.] > > > >>>>>>>>>> [node05.cluster:20901] MCW rank 2 bound to socket 0[core 0 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 0[core 1[hwt 0]]: [B/B/./.][./././.] > > > >>>>>>>>>> [node06.cluster:18443] MCW rank 7 bound to socket 0[core 2 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 0[core 3[hwt 0]]: [././B/B][./././.] > > > >>>>>>>>>> [node05.cluster:20901] MCW rank 3 bound to socket 0[core 2 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 0[core 3[hwt 0]]: [././B/B][./././.] > > > >>>>>>>>>> [node06.cluster:18443] MCW rank 4 bound to socket 0[core 0 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 0[core 1[hwt 0]]: [B/B/./.][./././.] > > > >>>>>>>>>> [node05.cluster:20901] MCW rank 0 bound to socket 0[core 0 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 0[core 1[hwt 0]]: [B/B/./.][./././.] > > > >>>>>>>>>> [node06.cluster:18443] MCW rank 5 bound to socket 0[core 2 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 0[core 3[hwt 0]]: [././B/B][./././.] > > > >>>>>>>>>> [node05.cluster:20901] MCW rank 1 bound to socket 0[core 2 > [hwt > > > >>> 0]],> > >>>>>>>> socket > > > >>>>>>>>>> 0[core 3[hwt 0]]: [././B/B][./././.] 
> > > >>>>>>>>>> Hello world from process 4 of 8 > > > >>>>>>>>>> Hello world from process 2 of 8 > > > >>>>>>>>>> Hello world from process 6 of 8 > > > >>>>>>>>>> Hello world from process 0 of 8 > > > >>>>>>>>>> Hello world from process 5 of 8 > > > >>>>>>>>>> Hello world from process 1 of 8 > > > >>>>>>>>>> Hello world from process 7 of 8 > > > >>>>>>>>>> Hello world from process 3 of 8 > > > >>>>>>>>>> > > > >>>>>>>>>> When I add "map-by obj:span", it works fine: > > > >>>>>>>>>> > > > >>>>>>>>>> mpirun -np 8 -host node05,node06 -report-bindings -map-by > > > >>>>>>>> socket:pe=2,span > > > >>>>>>>>>> -display-map ~/mis/ope > > > >>>>>>>>>> nmpi/demos/myprog > > > >>>>>>>>>> Data for JOB [14703,1] offset 0 > > > >>>>>>>>>> > > > >>>>>>>>>> ======================== JOB MAP > ======================== > > > >>>>>>>>>> > > > >>>>>>>>>> Data for node: node05 Num slots: 1 Max slots: 0 Num > > > > procs: > > > >>> 4 > > > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 0 > > > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 2 > > > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 1 > > > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 3 > > > >>>>>>>>>>> >>>>>>>>>> Data for node: node06 Num slots: 1 Max > slots: 0 Num > > > > procs: > > > >>> 4 > > > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 4 > > > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 6 > > > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 5 > > > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 7 > > > >>>>>>>>>> > > > >>>>>>>>>> > ============================================================= > > > >>>>>>>>>> [node06.cluster:18491] MCW rank 6 bound to socket 0[core 2 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 0[core 3[hwt 0]]: [././B/B][./././.] > > > >>>>>>>>>> [node05.cluster:20949] MCW rank 2 bound to socket 0[core 2 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 0[core 3[hwt 0]]: [././B/B][./././.] > > > >>>>>>>>>> [node06.cluster:18491] MCW rank 7 bound to socket 1[core 6 > [hwt > > > >>> 0]], > > > >>>>>>>> socket>>>>>>>>>> 1[core 7[hwt 0]]: [./././.][././B/B] > > > >>>>>>>>>> [node05.cluster:20949] MCW rank 3 bound to socket 1[core 6 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 1[core 7[hwt 0]]: [./././.][././B/B] > > > >>>>>>>>>> [node06.cluster:18491] MCW rank 4 bound to socket 0[core 0 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 0[core 1[hwt 0]]: [B/B/./.][./././.] > > > >>>>>>>>>> [node05.cluster:20949] MCW rank 0 bound to socket 0[core 0 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 0[core 1[hwt 0]]: [B/B/./.][./././.] > > > >>>>>>>>>> [node06.cluster:18491] MCW rank 5 bound to socket 1[core 4 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 1[core 5[hwt 0]]: [./././.][B/B/./.] > > > >>>>>>>>>> [node05.cluster:20949] MCW rank 1 bound to socket 1[core 4 > [hwt > > > >>> 0]], > > > >>>>>>>> socket > > > >>>>>>>>>> 1[core 5[hwt 0]]: [./././.][B/B/./.] > > > >>>>>>>>>> .... > > > >>>>>>>>>> > > > >>>>>>>>>> So, byobj_span would be okay. Of course, bynode and byslot > > > > should > > > >>> be > > > >>>>>>>> okay. > > > >>>>>>>>>> Could you take a look at orte_rmaps_rr_byobj again? 
> > > >>>>>>>>>>
> > > >>>>>>>>>> Regards,
> > > >>>>>>>>>> Tetsuya Mishima
patch_from_30895
Description: Binary data