Jeff, there are quite a lot of changes, and I have not updated master yet (it needs an extra pair of eyes to review...), so unless you want to make rc2 today and rc3 a week later, it is IMHO way safer to wait for v1.10.2.
Ralph, any thoughts ? Cheers, Gilles On Wednesday, October 7, 2015, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > Is this something that needs to go into v1.10.1? > > If so, a PR needs to be filed ASAP. We were supposed to make the next > 1.10.1 RC yesterday, but slipped to today due to some last second patches. > > > > On Oct 7, 2015, at 4:32 AM, Gilles Gouaillardet <gil...@rist.or.jp > <javascript:;>> wrote: > > > > Marcin, > > > > here is a patch for the master, hopefully it fixes all the issues we > discussed > > i will make sure it applies fine vs latest 1.10 tarball from tomorrow > > > > Cheers, > > > > Gilles > > > > > > On 10/6/2015 7:22 PM, marcin.krotkiewski wrote: > >> Gilles, > >> > >> Yes, it seemed that all was fine with binding in the patched 1.10.1rc1 > - thank you. Eagerly waiting for the other patches, let me know and I will > test them later this week. > >> > >> Marcin > >> > >> > >> > >> On 10/06/2015 12:09 PM, Gilles Gouaillardet wrote: > >>> Marcin, > >>> > >>> my understanding is that in this case, patched v1.10.1rc1 is working > just fine. > >>> am I right ? > >>> > >>> I prepared two patches > >>> one to remove the warning when binding on one core if only one core is > available, > >>> an other one to add a warning if the user asks a binding policy that > makes no sense with the required mapping policy > >>> > >>> I will finalize them tomorrow hopefully > >>> > >>> Cheers, > >>> > >>> Gilles > >>> > >>> On Tuesday, October 6, 2015, marcin.krotkiewski < > marcin.krotkiew...@gmail.com <javascript:;>> wrote: > >>> Hi, Gilles > >>>> you mentionned you had one failure with 1.10.1rc1 and -bind-to core > >>>> could you please send the full details (script, allocation and output) > >>>> in your slurm script, you can do > >>>> srun -N $SLURM_NNODES -n $SLURM_NNODES --cpu_bind=none -l grep > Cpus_allowed_list /proc/self/status > >>>> before invoking mpirun > >>>> > >>> It was an interactive job allocated with > >>> > >>> salloc --account=staff --ntasks=32 --mem-per-cpu=2G --time=120:0:0 > >>> > >>> The slurm environment is the following > >>> > >>> SLURM_JOBID=12714491 > >>> SLURM_JOB_CPUS_PER_NODE='4,2,5(x2),4,7,5' > >>> SLURM_JOB_ID=12714491 > >>> SLURM_JOB_NODELIST='c1-[2,4,8,13,16,23,26]' > >>> SLURM_JOB_NUM_NODES=7 > >>> SLURM_JOB_PARTITION=normal > >>> SLURM_MEM_PER_CPU=2048 > >>> SLURM_NNODES=7 > >>> SLURM_NODELIST='c1-[2,4,8,13,16,23,26]' > >>> SLURM_NODE_ALIASES='(null)' > >>> SLURM_NPROCS=32 > >>> SLURM_NTASKS=32 > >>> SLURM_SUBMIT_DIR=/cluster/home/marcink > >>> SLURM_SUBMIT_HOST=login-0-1.local > >>> SLURM_TASKS_PER_NODE='4,2,5(x2),4,7,5' > >>> > >>> The output of the command you asked for is > >>> > >>> 0: c1-2.local Cpus_allowed_list: 1-4,17-20 > >>> 1: c1-4.local Cpus_allowed_list: 1,15,17,31 > >>> 2: c1-8.local Cpus_allowed_list: 0,5,9,13-14,16,21,25,29-30 > >>> 3: c1-13.local Cpus_allowed_list: 3-7,19-23 > >>> 4: c1-16.local Cpus_allowed_list: 12-15,28-31 > >>> 5: c1-23.local Cpus_allowed_list: 2-4,8,13-15,18-20,24,29-31 > >>> 6: c1-26.local Cpus_allowed_list: 1,6,11,13,15,17,22,27,29,31 > >>> > >>> Running with command > >>> > >>> mpirun --mca rmaps_base_verbose 10 --hetero-nodes --bind-to core > --report-bindings --map-by socket -np 32 ./affinity > >>> > >>> I have attached two output files: one for the original 1.10.1rc1, one > for the patched version. > >>> > >>> When I said 'failed in one case' I was not precise. I got an error on > node c1-8, which was the first one to have different number of MPI > processes on the two sockets. 
It would also fail on some later nodes, just > that because of the error we never got there. > >>> > >>> Let me know if you need more. > >>> > >>> Marcin > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>>> Cheers, > >>>> > >>>> Gilles > >>>> > >>>> On 10/4/2015 11:55 PM, marcin.krotkiewski wrote: > >>>>> Hi, all, > >>>>> > >>>>> I played a bit more and it seems that the problem results from > >>>>> > >>>>> trg_obj = opal_hwloc_base_find_min_bound_target_under_obj() > >>>>> > >>>>> called in rmaps_base_binding.c / bind_downwards being wrong. I do > not know the reason, but I think I know when the problem happens (at least > on 1.10.1rc1). It seems that by default openmpi maps by socket. The error > happens when for a given compute node there is a different number of cores > used on each socket. Consider previously studied case (the debug outputs I > sent in last post). c1-8, which was source of error, has 5 mpi processes > assigned, and the cpuset is the following: > >>>>> > >>>>> 0, 5, 9, 13, 14, 16, 21, 25, 29, 30 > >>>>> > >>>>> Cores 0,5 are on socket 0, cores 9, 13, 14 are on socket 1. Binding > progresses correctly up to and including core 13 (see end of file > out.1.10.1rc2, before the error). That is 2 cores on socket 0, and 2 cores > on socket 1. Error is thrown when core 14 should be bound - extra core on > socket 1 with no corresponding core on socket 0. At that point the returned > trg_obj points to the first core on the node (os_index 0, socket 0). > >>>>> > >>>>> I have submitted a few other jobs and I always had an error in such > situation. Moreover, if I now use --map-by core instead of socket, the > error is gone, and I get my expected binding: > >>>>> > >>>>> rank 0 @ compute-1-2.local 1, 17, > >>>>> rank 1 @ compute-1-2.local 2, 18, > >>>>> rank 2 @ compute-1-2.local 3, 19, > >>>>> rank 3 @ compute-1-2.local 4, 20, > >>>>> rank 4 @ compute-1-4.local 1, 17, > >>>>> rank 5 @ compute-1-4.local 15, 31, > >>>>> rank 6 @ compute-1-8.local 0, 16, > >>>>> rank 7 @ compute-1-8.local 5, 21, > >>>>> rank 8 @ compute-1-8.local 9, 25, > >>>>> rank 9 @ compute-1-8.local 13, 29, > >>>>> rank 10 @ compute-1-8.local 14, 30, > >>>>> rank 11 @ compute-1-13.local 3, 19, > >>>>> rank 12 @ compute-1-13.local 4, 20, > >>>>> rank 13 @ compute-1-13.local 5, 21, > >>>>> rank 14 @ compute-1-13.local 6, 22, > >>>>> rank 15 @ compute-1-13.local 7, 23, > >>>>> rank 16 @ compute-1-16.local 12, 28, > >>>>> rank 17 @ compute-1-16.local 13, 29, > >>>>> rank 18 @ compute-1-16.local 14, 30, > >>>>> rank 19 @ compute-1-16.local 15, 31, > >>>>> rank 20 @ compute-1-23.local 2, 18, > >>>>> rank 29 @ compute-1-26.local 11, 27, > >>>>> rank 21 @ compute-1-23.local 3, 19, > >>>>> rank 30 @ compute-1-26.local 13, 29, > >>>>> rank 22 @ compute-1-23.local 4, 20, > >>>>> rank 31 @ compute-1-26.local 15, 31, > >>>>> rank 23 @ compute-1-23.local 8, 24, > >>>>> rank 27 @ compute-1-26.local 1, 17, > >>>>> rank 24 @ compute-1-23.local 13, 29, > >>>>> rank 28 @ compute-1-26.local 6, 22, > >>>>> rank 25 @ compute-1-23.local 14, 30, > >>>>> rank 26 @ compute-1-23.local 15, 31, > >>>>> > >>>>> Using --map-by core seems to fix the issue on 1.8.8, 1.10.0 and > 1.10.1rc1. However, there is still a difference in behavior between > 1.10.1rc1 and earlier versions. In the SLURM job described in last post, > 1.10.1rc1 fails to bind only in 1 case, while the earlier versions fail in > 21 out of 32 cases. You mentioned there was a bug in hwloc. Not sure if it > can explain the difference in behavior. 
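For reference, the uneven per-socket core count described above can be checked directly with hwloc, outside of Open MPI. The following is a minimal standalone sketch; it is not Open MPI's bind_downwards or opal_hwloc_base_find_min_bound_target_under_obj code, and it assumes the hwloc 1.x API of that era (HWLOC_OBJ_SOCKET and the WHOLE_SYSTEM flag; hwloc 2.x renames these). On a node with the c1-8 cpuset quoted above it should report 2 available cores on socket 0 and 3 on socket 1, the asymmetry that appears to trip the socket mapper.

/* Illustrative sketch only (not Open MPI code): count how many of the
 * cores allowed by the job's cgroup/cpuset fall under each socket.
 * hwloc 1.x API; build with e.g. "gcc cores_per_socket.c -lhwloc". */
#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;

    hwloc_topology_init(&topo);
    /* keep disallowed PUs in the topology so we can compare them
     * against the allowed cpuset reported by the cgroup */
    hwloc_topology_set_flags(topo, HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM);
    hwloc_topology_load(topo);

    hwloc_const_cpuset_t allowed = hwloc_topology_get_allowed_cpuset(topo);
    int nsock = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_SOCKET);

    for (int s = 0; s < nsock; s++) {
        hwloc_obj_t sock = hwloc_get_obj_by_type(topo, HWLOC_OBJ_SOCKET, s);
        int ncores = hwloc_get_nbobjs_inside_cpuset_by_type(topo, sock->cpuset,
                                                            HWLOC_OBJ_CORE);
        int avail = 0;
        for (int c = 0; c < ncores; c++) {
            hwloc_obj_t core = hwloc_get_obj_inside_cpuset_by_type(
                topo, sock->cpuset, HWLOC_OBJ_CORE, c);
            /* a core counts as available if at least one of its PUs is allowed */
            if (hwloc_bitmap_intersects(core->cpuset, allowed))
                avail++;
        }
        printf("socket %d: %d of %d cores available\n", s, avail, ncores);
    }

    hwloc_topology_destroy(topo);
    return 0;
}

Running this under the same allocation (for instance one copy per node via srun -N $SLURM_NNODES -n $SLURM_NNODES, as in the grep Cpus_allowed_list command above) shows the per-node, per-socket picture the mapper has to work with.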
> >>>>> > >>>>> Hope this helps to nail this down. > >>>>> > >>>>> Marcin > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> On 10/04/2015 09:55 AM, Gilles Gouaillardet wrote: > >>>>>> Ralph, > >>>>>> > >>>>>> I suspect ompi tries to bind to threads outside the cpuset. > >>>>>> this could be pretty similar to a previous issue when ompi tried to > bind to cores outside the cpuset. > >>>>>> /* when a core has more than one thread, would ompi assume all the > threads are available if the core is available ? */ > >>>>>> I will investigate this from tomorrow > >>>>>> > >>>>>> Cheers, > >>>>>> > >>>>>> Gilles > >>>>>> > >>>>>> On Sunday, October 4, 2015, Ralph Castain <r...@open-mpi.org > <javascript:;>> wrote: > >>>>>> Thanks - please go ahead and release that allocation as I’m not > going to get to this immediately. I’ve got several hot irons in the fire > right now, and I’m not sure when I’ll get a chance to track this down. > >>>>>> > >>>>>> Gilles or anyone else who might have time - feel free to take a > gander and see if something pops out at you. > >>>>>> > >>>>>> Ralph > >>>>>> > >>>>>> > >>>>>>> On Oct 3, 2015, at 11:05 AM, marcin.krotkiewski < > marcin.krotkiew...@gmail.com <javascript:;>> wrote: > >>>>>>> > >>>>>>> > >>>>>>> Done. I have compiled 1.10.0 and 1.10.rc1 with --enable-debug and > executed > >>>>>>> > >>>>>>> mpirun --mca rmaps_base_verbose 10 --hetero-nodes > --report-bindings --bind-to core -np 32 ./affinity > >>>>>>> > >>>>>>> In case of 1.10.rc1 I have also added :overload-allowed - output > in a separate file. This option did not make much difference for 1.10.0, so > I did not attach it here. > >>>>>>> > >>>>>>> First thing I noted for 1.10.0 are lines like > >>>>>>> > >>>>>>> [login-0-1.local:03399] [[37945,0],0] GOT 1 CPUS > >>>>>>> [login-0-1.local:03399] [[37945,0],0] PROC [[37945,1],27] BITMAP > >>>>>>> [login-0-1.local:03399] [[37945,0],0] PROC [[37945,1],27] ON c1-26 > IS NOT BOUND > >>>>>>> > >>>>>>> with an empty BITMAP. > >>>>>>> > >>>>>>> The SLURM environment is > >>>>>>> > >>>>>>> set | grep SLURM > >>>>>>> SLURM_JOBID=12714491 > >>>>>>> SLURM_JOB_CPUS_PER_NODE='4,2,5(x2),4,7,5' > >>>>>>> SLURM_JOB_ID=12714491 > >>>>>>> SLURM_JOB_NODELIST='c1-[2,4,8,13,16,23,26]' > >>>>>>> SLURM_JOB_NUM_NODES=7 > >>>>>>> SLURM_JOB_PARTITION=normal > >>>>>>> SLURM_MEM_PER_CPU=2048 > >>>>>>> SLURM_NNODES=7 > >>>>>>> SLURM_NODELIST='c1-[2,4,8,13,16,23,26]' > >>>>>>> SLURM_NODE_ALIASES='(null)' > >>>>>>> SLURM_NPROCS=32 > >>>>>>> SLURM_NTASKS=32 > >>>>>>> SLURM_SUBMIT_DIR=/cluster/home/marcink > >>>>>>> SLURM_SUBMIT_HOST=login-0-1.local > >>>>>>> SLURM_TASKS_PER_NODE='4,2,5(x2),4,7,5' > >>>>>>> > >>>>>>> I have submitted an interactive job on screen for 120 hours now to > work with one example, and not change it for every post :) > >>>>>>> > >>>>>>> If you need anything else, let me know. I could introduce some > patch/printfs and recompile, if you need it. > >>>>>>> > >>>>>>> Marcin > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> On 10/03/2015 07:17 PM, Ralph Castain wrote: > >>>>>>>> Rats - just realized I have no way to test this as none of the > machines I can access are setup for cgroup-based multi-tenant. Is this a > debug version of OMPI? If not, can you rebuild OMPI with —enable-debug? > >>>>>>>> > >>>>>>>> Then please run it with —mca rmaps_base_verbose 10 and pass along > the output. 
> >>>>>>>> > >>>>>>>> Thanks > >>>>>>>> Ralph > >>>>>>>> > >>>>>>>> > >>>>>>>>> On Oct 3, 2015, at 10:09 AM, Ralph Castain <r...@open-mpi.org > <javascript:;>> wrote: > >>>>>>>>> > >>>>>>>>> What version of slurm is this? I might try to debug it here. I’m > not sure where the problem lies just yet. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> On Oct 3, 2015, at 8:59 AM, marcin.krotkiewski < > marcin.krotkiew...@gmail.com <javascript:;>> wrote: > >>>>>>>>>> > >>>>>>>>>> Here is the output of lstopo. In short, (0,16) are core 0, > (1,17) - core 1 etc. > >>>>>>>>>> > >>>>>>>>>> Machine (64GB) > >>>>>>>>>> NUMANode L#0 (P#0 32GB) > >>>>>>>>>> Socket L#0 + L3 L#0 (20MB) > >>>>>>>>>> L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core > L#0 > >>>>>>>>>> PU L#0 (P#0) > >>>>>>>>>> PU L#1 (P#16) > >>>>>>>>>> L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core > L#1 > >>>>>>>>>> PU L#2 (P#1) > >>>>>>>>>> PU L#3 (P#17) > >>>>>>>>>> L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core > L#2 > >>>>>>>>>> PU L#4 (P#2) > >>>>>>>>>> PU L#5 (P#18) > >>>>>>>>>> L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core > L#3 > >>>>>>>>>> PU L#6 (P#3) > >>>>>>>>>> PU L#7 (P#19) > >>>>>>>>>> L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core > L#4 > >>>>>>>>>> PU L#8 (P#4) > >>>>>>>>>> PU L#9 (P#20) > >>>>>>>>>> L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core > L#5 > >>>>>>>>>> PU L#10 (P#5) > >>>>>>>>>> PU L#11 (P#21) > >>>>>>>>>> L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core > L#6 > >>>>>>>>>> PU L#12 (P#6) > >>>>>>>>>> PU L#13 (P#22) > >>>>>>>>>> L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core > L#7 > >>>>>>>>>> PU L#14 (P#7) > >>>>>>>>>> PU L#15 (P#23) > >>>>>>>>>> HostBridge L#0 > >>>>>>>>>> PCIBridge > >>>>>>>>>> PCI 8086:1521 > >>>>>>>>>> Net L#0 "eth0" > >>>>>>>>>> PCI 8086:1521 > >>>>>>>>>> Net L#1 "eth1" > >>>>>>>>>> PCIBridge > >>>>>>>>>> PCI 15b3:1003 > >>>>>>>>>> Net L#2 "ib0" > >>>>>>>>>> OpenFabrics L#3 "mlx4_0" > >>>>>>>>>> PCIBridge > >>>>>>>>>> PCI 102b:0532 > >>>>>>>>>> PCI 8086:1d02 > >>>>>>>>>> Block L#4 "sda" > >>>>>>>>>> NUMANode L#1 (P#1 32GB) + Socket L#1 + L3 L#1 (20MB) > >>>>>>>>>> L2 L#8 (256KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8 > >>>>>>>>>> PU L#16 (P#8) > >>>>>>>>>> PU L#17 (P#24) > >>>>>>>>>> L2 L#9 (256KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9 > >>>>>>>>>> PU L#18 (P#9) > >>>>>>>>>> PU L#19 (P#25) > >>>>>>>>>> L2 L#10 (256KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core > L#10 > >>>>>>>>>> PU L#20 (P#10) > >>>>>>>>>> PU L#21 (P#26) > >>>>>>>>>> L2 L#11 (256KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core > L#11 > >>>>>>>>>> PU L#22 (P#11) > >>>>>>>>>> PU L#23 (P#27) > >>>>>>>>>> L2 L#12 (256KB) + L1d L#12 (32KB) + L1i L#12 (32KB) + Core > L#12 > >>>>>>>>>> PU L#24 (P#12) > >>>>>>>>>> PU L#25 (P#28) > >>>>>>>>>> L2 L#13 (256KB) + L1d L#13 (32KB) + L1i L#13 (32KB) + Core > L#13 > >>>>>>>>>> PU L#26 (P#13) > >>>>>>>>>> PU L#27 (P#29) > >>>>>>>>>> L2 L#14 (256KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core > L#14 > >>>>>>>>>> PU L#28 (P#14) > >>>>>>>>>> PU L#29 (P#30) > >>>>>>>>>> L2 L#15 (256KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core > L#15 > >>>>>>>>>> PU L#30 (P#15) > >>>>>>>>>> PU L#31 (P#31) > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On 10/03/2015 05:46 PM, Ralph Castain wrote: > >>>>>>>>>>> Maybe I’m just misreading your HT map - that slurm nodelist > syntax is a new one to me, but they tend to change things around. Could you > run lstopo on one of those compute nodes and send the output? 
> >>>>>>>>>>> > >>>>>>>>>>> I’m just suspicious because I’m not seeing a clear pairing of > HT numbers in your output, but HT numbering is BIOS-specific and I may just > not be understanding your particular pattern. Our error message is clearly > indicating that we are seeing individual HTs (and not complete cores) > assigned, and I don’t know the source of that confusion. > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>>> On Oct 3, 2015, at 8:28 AM, marcin.krotkiewski < > marcin.krotkiew...@gmail.com <javascript:;>> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On 10/03/2015 04:38 PM, Ralph Castain wrote: > >>>>>>>>>>>>> If mpirun isn’t trying to do any binding, then you will of > course get the right mapping as we’ll just inherit whatever we received. > >>>>>>>>>>>> Yes. I meant that whatever you received (what SLURM gives) is > a correct cpu map and assigns _whole_ CPUs, not a single HT to MPI > processes. In the case mentioned earlier openmpi should start 6 tasks on > c1-30. If HT would be treated as separate and independent cores, > sched_getaffinity of an MPI process started on c1-30 would return a map > with 6 entries only. In my case it returns a map > with 12 entries - 2 for each core. So one > process is in fact allocated both HTs, not only one. Is what I'm saying > correct? > >>>>>>>>>>>> > >>>>>>>>>>>>> Looking at your output, it’s pretty clear that you are > getting independent HTs assigned and not full cores. > >>>>>>>>>>>> How do you mean? Is the above understanding wrong? I would > expect that on c1-30 with --bind-to core openmpi should bind to logical > cores 0 and 16 (rank 0), 1 and 17 (rank 2) and so on. All those logical > cores are available in sched_getaffinity map, and there is twice as many > logical cores as there are MPI processes started on the node. > >>>>>>>>>>>> > >>>>>>>>>>>>> My guess is that something in slurm has changed such that it > detects that HT has been enabled, and then begins treating the HTs as > completely independent cpus. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Try changing “-bind-to core” to “-bind-to hwthread > -use-hwthread-cpus” and see if that works > >>>>>>>>>>>>> > >>>>>>>>>>>> I have and the binding is wrong. For example, I got this > output > >>>>>>>>>>>> > >>>>>>>>>>>> rank 0 @ compute-1-30.local 0, > >>>>>>>>>>>> rank 1 @ compute-1-30.local 16, > >>>>>>>>>>>> > >>>>>>>>>>>> Which means that two ranks have been bound to the same > physical core (logical cores 0 and 16 are two HTs of the same core). If I > use --bind-to core, I get the following correct binding > >>>>>>>>>>>> > >>>>>>>>>>>> rank 0 @ compute-1-30.local 0, 16, > >>>>>>>>>>>> > >>>>>>>>>>>> The problem is many other ranks get bad binding with 'rank > XXX is not bound (or bound to all available processors)' warning. > >>>>>>>>>>>> > >>>>>>>>>>>> But I think I was not entirely correct saying that 1.10.1rc1 > did not fix things. It still might have improved something, but not > everything. 
Consider this job: > >>>>>>>>>>>> > >>>>>>>>>>>> SLURM_JOB_CPUS_PER_NODE='5,4,6,5(x2),7,5,9,5,7,6' > >>>>>>>>>>>> SLURM_JOB_NODELIST='c8-[31,34],c9-[30-32,35-36],c10-[31-34]' > >>>>>>>>>>>> > >>>>>>>>>>>> If I run 32 tasks as follows (with 1.10.1rc1) > >>>>>>>>>>>> > >>>>>>>>>>>> mpirun --hetero-nodes --report-bindings --bind-to core -np 32 > ./affinity > >>>>>>>>>>>> > >>>>>>>>>>>> I get the following error: > >>>>>>>>>>>> > >>>>>>>>>>>> > -------------------------------------------------------------------------- > >>>>>>>>>>>> A request was made to bind to that would result in binding > more > >>>>>>>>>>>> processes than cpus on a resource: > >>>>>>>>>>>> > >>>>>>>>>>>> Bind to: CORE > >>>>>>>>>>>> Node: c9-31 > >>>>>>>>>>>> #processes: 2 > >>>>>>>>>>>> #cpus: 1 > >>>>>>>>>>>> > >>>>>>>>>>>> You can override this protection by adding the > "overload-allowed" > >>>>>>>>>>>> option to your binding directive. > >>>>>>>>>>>> > -------------------------------------------------------------------------- > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> If I now use --bind-to core:overload-allowed, then openmpi > starts and _most_ of the threads are bound correctly (i.e., map contains > two logical cores in ALL cases), except this case that required the > overload flag: > >>>>>>>>>>>> > >>>>>>>>>>>> rank 15 @ compute-9-31.local 1, 17, > >>>>>>>>>>>> rank 16 @ compute-9-31.local 11, 27, > >>>>>>>>>>>> rank 17 @ compute-9-31.local 2, 18, > >>>>>>>>>>>> rank 18 @ compute-9-31.local 12, 28, > >>>>>>>>>>>> rank 19 @ compute-9-31.local 1, 17, > >>>>>>>>>>>> > >>>>>>>>>>>> Note pair 1,17 is used twice. The original SLURM delivered > map (no binding) on this node is > >>>>>>>>>>>> > >>>>>>>>>>>> rank 15 @ compute-9-31.local 1, 2, 11, 12, 13, 17, 18, 27, > 28, 29, > >>>>>>>>>>>> rank 16 @ compute-9-31.local 1, 2, 11, 12, 13, 17, 18, 27, > 28, 29, > >>>>>>>>>>>> rank 17 @ compute-9-31.local 1, 2, 11, 12, 13, 17, 18, 27, > 28, 29, > >>>>>>>>>>>> rank 18 @ compute-9-31.local 1, 2, 11, 12, 13, 17, 18, 27, > 28, 29, > >>>>>>>>>>>> rank 19 @ compute-9-31.local 1, 2, 11, 12, 13, 17, 18, 27, > 28, 29, > >>>>>>>>>>>> > >>>>>>>>>>>> Why does openmpi use cores (1,17) twice instead of using core > (13,29)? Clearly, the original SLURM-delivered map has 5 CPUs included, > enough for 5 MPI processes. > >>>>>>>>>>>> > >>>>>>>>>>>> Cheers, > >>>>>>>>>>>> > >>>>>>>>>>>> Marcin > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>>> On Oct 3, 2015, at 7:12 AM, marcin.krotkiewski < > marcin.krotkiew...@gmail.com <javascript:;>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On 10/03/2015 01:06 PM, Ralph Castain wrote: > >>>>>>>>>>>>>>> Thanks Marcin. Looking at this, I’m guessing that Slurm > may be treating HTs as “cores” - i.e., as independent cpus. Any chance that > is true? > >>>>>>>>>>>>>> Not to the best of my knowledge, and at least not > intentionally. SLURM starts as many processes as there are physical cores, > not threads. 
To verify this, consider this test case: [...] > > <heterogeneous_topologies.patch> > -- > Jeff Squyres > jsquy...@cisco.com
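For completeness: the ./affinity test program used throughout this thread is not included in the messages above. A minimal stand-in, my own sketch rather than Marcin's actual code, that prints each rank's hostname and sched_getaffinity mask in the same "rank N @ host cpu, cpu," format looks like this:

/* Sketch of an affinity reporter similar to the ./affinity test in this thread.
 * Prints the set of logical CPUs each MPI rank is allowed to run on.
 * Build with: mpicc affinity.c -o affinity */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    char host[256];
    cpu_set_t mask;
    char line[4096];
    int len;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    gethostname(host, sizeof(host));
    CPU_ZERO(&mask);
    sched_getaffinity(0, sizeof(mask), &mask);   /* 0 = calling process */

    len = snprintf(line, sizeof(line), "rank %d @ %s ", rank, host);
    for (int cpu = 0; cpu < CPU_SETSIZE && len < (int)sizeof(line) - 16; cpu++)
        if (CPU_ISSET(cpu, &mask))
            len += snprintf(line + len, sizeof(line) - len, "%d, ", cpu);
    printf("%s\n", line);

    MPI_Finalize();
    return 0;
}

Built with mpicc and launched with the mpirun command lines quoted above, it should report two logical CPUs per rank when whole cores are bound (e.g. "1, 17,") and a single one when a rank ends up bound to an individual hardware thread.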