No need for the lstopo data anymore, Bill - I was able to recreate the situation using some very nice hwloc functions plus your prior descriptions. I'm not totally confident that this fix will resolve the problem, but it will clear out at least one issue.
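For reference, and purely as an illustrative sketch (not the exact hwloc calls used here): lstopo can export a node's topology to XML, and that XML can be replayed later with "lstopo -i <file>.xml" or loaded in code via hwloc_topology_set_xml(), so a cluster's layout can be reproduced without access to the machines. Something along these lines, submitted once per node through qsub, is one way to gather the per-node data; the output directory under $HOME, the script name, and the assumption that lstopo is on every node's PATH are illustrative only:

#!/bin/bash
# collect-topo.sh -- run once on each node (for example via a per-node qsub
# submission) to dump that node's hwloc topology into a shared $HOME area.
# The directory name is an assumption for illustration, not from this thread.
OUTDIR="$HOME/lstopo-out"
mkdir -p "$OUTDIR"

# XML keeps the full detail (sockets, cores, PUs, caches) and can be replayed
# later on another machine with "lstopo -i <hostname>.xml".
lstopo "$OUTDIR/$(hostname).xml"

# A human-readable summary as well; more than one PU per core in this output
# means hyperthreading is enabled on that node (see Jeff's note further down).
lstopo --of console > "$OUTDIR/$(hostname).txt"

Once the files are collected, inspecting a given node's layout from any machine with hwloc installed is just "lstopo -i csclprd3-0-1.xml".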
We'll just have to see what happens and attack it next. Ralph On Tue, Jul 7, 2015 at 8:07 PM, Lane, William <william.l...@cshs.org> wrote: > I'm sorry I haven't been able to get the lstopo information for > all the nodes, but I had to get the latest version of hwloc installed > first. They've even added in some more modern blades that also > support hyperthreading, ugh. They've also been doing some memory > upgrades as well. > > I'm trying to get a Bash script running on the cluster via qsub > that will run lstopo and output the host information to a file located > in my $HOME directory but it hasn't been working (there are 60 nodes > in the heterogeneous cluster that needs to have OpenMPI running). > > I will try to get the lstopo information by the end of the week. > > I'd be willing to do most anything to get these OpenMPI issues > resolved. I'd even wash your cars for you! > > -Bill L. > ________________________________________ > From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [ > r...@open-mpi.org] > Sent: Tuesday, July 07, 2015 1:36 PM > To: Open MPI Users > Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash > > I may have finally tracked this down. At least, I can now get the correct > devel map to come out, and found a memory corruption issue that only > impacted hetero operations. I can’t know if this is the root cause of the > problem Bill is seeing, however, as I have no way of actually running the > job. > > I pushed this into the master and will bring it back to 1.8.7 as well as > 1.10. > > Bill - would you be able/willing to give it a try there? It would be nice > to confirm this actually fixed the problem. > > > > On Jun 29, 2015, at 1:58 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> > wrote: > > > > lstopo will tell you -- if there is more than one "PU" (hwloc > terminology for "processing unit") per core, then hyper threading is > enabled. If there's only one PU per core, then hyper threading is disabled. > > > > > >> On Jun 29, 2015, at 4:42 PM, Lane, William <william.l...@cshs.org> > wrote: > >> > >> Would the output of dmidecode -t processor and/or lstopo tell me > conclusively > >> if hyperthreading is enabled or not? Hyperthreading is supposed to be > enabled > >> for all the IBM x3550 M3 and M4 nodes, but I'm not sure if it actually > is and I > >> don't have access to the BIOS settings. > >> > >> -Bill L. > >> > >> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [ > r...@open-mpi.org] > >> Sent: Saturday, June 27, 2015 7:21 PM > >> To: Open MPI Users > >> Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = > crash > >> > >> Bill - this is such a jumbled collection of machines that I’m having > trouble figuring out what I should replicate. I can create something > artificial here so I can try to debug this, but I need to know exactly what > I’m up against - can you tell me: > >> > >> * the architecture of each type - how many sockets, how many > cores/socket, HT on or off. If two nodes have the same physical setup but > one has HT on and the other off, then please consider those two different > types > >> > >> * how many nodes of each type > >> > >> Looking at your map output, it looks like the map is being done > correctly, but somehow the binding locale isn’t getting set in some cases. > You latest error output would seem out-of-step with your prior reports, so > something else may be going on there. 
As I said earlier, this is the most > hetero environment we’ve seen, and so there may be some code paths your > hitting that haven’t been well exercised before. > >> > >> > >> > >> > >>> On Jun 26, 2015, at 5:22 PM, Lane, William <william.l...@cshs.org> > wrote: > >>> > >>> Well, I managed to get a successful mpirun @ a slot count of 132 using > --mca btl ^sm, > >>> however when I increased the slot count to 160, mpirun crashed without > any error > >>> output: > >>> > >>> mpirun -np 160 -display-devel-map --prefix > /hpc/apps/mpi/openmpi/1.8.6/ --hostfile hostfile-noslots --mca btl ^sm > --hetero-nodes --bind-to core /hpc/home/lanew/mpi/openmpi/ProcessColors3 >> > out.txt 2>&1 > >>> > >>> > -------------------------------------------------------------------------- > >>> WARNING: a request was made to bind a process. While the system > >>> supports binding the process itself, at least one node does NOT > >>> support binding memory to the process location. > >>> > >>> Node: csclprd3-6-1 > >>> > >>> This usually is due to not having the required NUMA support installed > >>> on the node. In some Linux distributions, the required support is > >>> contained in the libnumactl and libnumactl-devel packages. > >>> This is a warning only; your job will continue, though performance may > be degraded. > >>> > -------------------------------------------------------------------------- > >>> > -------------------------------------------------------------------------- > >>> A request was made to bind to that would result in binding more > >>> processes than cpus on a resource: > >>> > >>> Bind to: CORE > >>> Node: csclprd3-6-1 > >>> #processes: 2 > >>> #cpus: 1 > >>> > >>> You can override this protection by adding the "overload-allowed" > >>> option to your binding directive. > >>> > -------------------------------------------------------------------------- > >>> > >>> But csclprd3-6-1 (a blade) does have 2 CPU's on 2 separate sockets w/2 > cores apiece as shown in my dmidecode output: > >>> > >>> csclprd3-6-1 ~]# dmidecode -t processor > >>> # dmidecode 2.11 > >>> SMBIOS 2.4 present. 
> >>> > >>> Handle 0x0008, DMI type 4, 32 bytes > >>> Processor Information > >>> Socket Designation: Socket 1 CPU 1 > >>> Type: Central Processor > >>> Family: Xeon > >>> Manufacturer: GenuineIntel > >>> ID: F6 06 00 00 01 03 00 00 > >>> Signature: Type 0, Family 6, Model 15, Stepping 6 > >>> Flags: > >>> FPU (Floating-point unit on-chip) > >>> CX8 (CMPXCHG8 instruction supported) > >>> APIC (On-chip APIC hardware supported) > >>> Version: Intel Xeon > >>> Voltage: 2.9 V > >>> External Clock: 333 MHz > >>> Max Speed: 4000 MHz > >>> Current Speed: 3000 MHz > >>> Status: Populated, Enabled > >>> Upgrade: ZIF Socket > >>> L1 Cache Handle: 0x0004 > >>> L2 Cache Handle: 0x0005 > >>> L3 Cache Handle: Not Provided > >>> > >>> Handle 0x0009, DMI type 4, 32 bytes > >>> Processor Information > >>> Socket Designation: Socket 2 CPU 2 > >>> Type: Central Processor > >>> Family: Xeon > >>> Manufacturer: GenuineIntel > >>> ID: F6 06 00 00 01 03 00 00 > >>> Signature: Type 0, Family 6, Model 15, Stepping 6 > >>> Flags: > >>> FPU (Floating-point unit on-chip) > >>> CX8 (CMPXCHG8 instruction supported) > >>> APIC (On-chip APIC hardware supported) > >>> Version: Intel Xeon > >>> Voltage: 2.9 V > >>> External Clock: 333 MHz > >>> Max Speed: 4000 MHz > >>> Current Speed: 3000 MHz > >>> Status: Populated, Enabled > >>> Upgrade: ZIF Socket > >>> L1 Cache Handle: 0x0006 > >>> L2 Cache Handle: 0x0007 > >>> L3 Cache Handle: Not Provided > >>> > >>> csclprd3-6-1 ~]# lstopo > >>> Machine (16GB) > >>> Socket L#0 + L2 L#0 (4096KB) > >>> L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0) > >>> L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#2) > >>> Socket L#1 + L2 L#1 (4096KB) > >>> L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#1) > >>> L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3) > >>> > >>> csclprd3-0-1 information (which looks correct as this particular x3550 > should > >>> have one socket populated (of two) with a 6 core Xeon (or 12 cores > w/hyperthreading > >>> turned on): > >>> > >>> csclprd3-0-1 ~]# lstopo > >>> Machine (71GB) > >>> Socket L#0 + L3 L#0 (12MB) > >>> L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + > PU L#0 (P#0) > >>> L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + > PU L#1 (P#1) > >>> L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + > PU L#2 (P#2) > >>> L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + > PU L#3 (P#3) > >>> L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + > PU L#4 (P#4) > >>> L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + > PU L#5 (P#5) > >>> > >>> csclprd3-0-1 ~]# dmidecode -t processor > >>> # dmidecode 2.11 > >>> # SMBIOS entry point at 0x7f6be000 > >>> SMBIOS 2.5 present. 
> >>> > >>> Handle 0x0001, DMI type 4, 40 bytes > >>> Processor Information > >>> Socket Designation: Node 1 Socket 1 > >>> Type: Central Processor > >>> Family: Xeon MP > >>> Manufacturer: Intel(R) Corporation > >>> ID: C2 06 02 00 FF FB EB BF > >>> Signature: Type 0, Family 6, Model 44, Stepping 2 > >>> Flags: > >>> FPU (Floating-point unit on-chip) > >>> VME (Virtual mode extension) > >>> DE (Debugging extension) > >>> PSE (Page size extension) > >>> TSC (Time stamp counter) > >>> MSR (Model specific registers) > >>> PAE (Physical address extension) > >>> MCE (Machine check exception) > >>> CX8 (CMPXCHG8 instruction supported) > >>> APIC (On-chip APIC hardware supported) > >>> SEP (Fast system call) > >>> MTRR (Memory type range registers) > >>> PGE (Page global enable) > >>> MCA (Machine check architecture) > >>> CMOV (Conditional move instruction supported) > >>> PAT (Page attribute table) > >>> PSE-36 (36-bit page size extension) > >>> CLFSH (CLFLUSH instruction supported) > >>> DS (Debug store) > >>> ACPI (ACPI supported) > >>> MMX (MMX technology supported) > >>> FXSR (FXSAVE and FXSTOR instructions supported) > >>> SSE (Streaming SIMD extensions) > >>> SSE2 (Streaming SIMD extensions 2) > >>> SS (Self-snoop) > >>> HTT (Multi-threading) > >>> TM (Thermal monitor supported) > >>> PBE (Pending break enabled) > >>> Version: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz > >>> Voltage: 1.2 V > >>> External Clock: 5866 MHz > >>> Max Speed: 4400 MHz > >>> Current Speed: 2400 MHz > >>> Status: Populated, Enabled > >>> Upgrade: ZIF Socket > >>> L1 Cache Handle: 0x0002 > >>> L2 Cache Handle: 0x0003 > >>> L3 Cache Handle: 0x0004 > >>> Serial Number: Not Specified > >>> Asset Tag: Not Specified > >>> Part Number: Not Specified > >>> Core Count: 6 > >>> Core Enabled: 6 > >>> Thread Count: 6 > >>> Characteristics: > >>> 64-bit capable > >>> > >>> Handle 0x005A, DMI type 4, 40 bytes > >>> Processor Information > >>> Socket Designation: Node 1 Socket 2 > >>> Type: Central Processor > >>> Family: Xeon MP > >>> Manufacturer: Not Specified > >>> ID: 00 00 00 00 00 00 00 00 > >>> Signature: Type 0, Family 0, Model 0, Stepping 0 > >>> Flags: None > >>> Version: Not Specified > >>> Voltage: 1.2 V > >>> External Clock: 5866 MHz > >>> Max Speed: 4400 MHz > >>> Current Speed: Unknown > >>> Status: Unpopulated > >>> Upgrade: ZIF Socket > >>> L1 Cache Handle: Not Provided > >>> L2 Cache Handle: Not Provided > >>> L3 Cache Handle: Not Provided > >>> Serial Number: Not Specified > >>> Asset Tag: Not Specified > >>> Part Number: Not Specified > >>> Characteristics: None > >>> > >>> > >>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [ > r...@open-mpi.org] > >>> Sent: Wednesday, June 24, 2015 6:06 AM > >>> To: Open MPI Users > >>> Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = > crash > >>> > >>> I think trying with --mca btl ^sm makes a lot of sense and may solve > the problem. I also noted that we are having trouble with the topology of > several of the nodes - seeing only one socket, non-HT where you say we > should see two sockets and HT-enabled. In those cases, the locality is > "unknown" - given that those procs are on remote nodes from the one being > impacted, I don't think it should cause a problem. However, it isn't > correct, and that raises flags. > >>> > >>> My best guess of the root cause of that error is either we are getting > bad topology info on those nodes, or we have a bug that is mishandling this > scenario. 
It would probably be good to get this error fixed to ensure it > isn't the source of the eventual crash, even though I'm not sure they are > related. > >>> > >>> Bill: Can we examine one of the problem nodes? Let's pick csclprd3-0-1 > (or take another one from your list - just look for one where "locality" is > reported as "unknown" for the procs in the output map). Can you run lstopo > on that node and send us the output? In the above map, it is reporting a > single socket with 6 cores, non-HT. Is that what lstopo shows when run on > the node? Is it what you expected? > >>> > >>> > >>> On Wed, Jun 24, 2015 at 4:07 AM, Gilles Gouaillardet < > gilles.gouaillar...@gmail.com> wrote: > >>> Bill, > >>> > >>> were you able to get a core file and analyze the stack with gdb ? > >>> > >>> I suspect the error occurs in mca_btl_sm_add_procs but this is just my > best guess. > >>> if this is correct, can you check the value of > mca_btl_sm_component.num_smp_procs ? > >>> > >>> as a workaround, can you try > >>> mpirun --mca btl ^sm ... > >>> > >>> I do not see how I can tackle the root cause without being able to > reproduce the issue :-( > >>> > >>> can you try to reproduce the issue with the smallest hostfile, and > then run lstopo on all the nodes ? > >>> btw, you are not mixing 32 bits and 64 bits OS, are you ? > >>> > >>> Cheers, > >>> > >>> Gilles > >>> > >>> > >>> > >>> > >>> mca_btl_sm_add_procs( > >>> > >>> > >>> > >>> int > >>> > >>> mca_btl_sm_add_procs > >>> ( > >>> On Wednesday, June 24, 2015, Lane, William <william.l...@cshs.org> > wrote: > >>> Gilles, > >>> > >>> All the blades only have two core Xeons (without hyperthreading) > populating both their sockets. All > >>> the x3550 nodes have hyperthreading capable Xeons and Sandybridge > server CPU's. It's possible > >>> hyperthreading has been disabled on some of these nodes though. The > 3-0-n nodes are all IBM x3550 > >>> nodes while the 3-6-n nodes are all blade nodes. > >>> > >>> I have run this exact same test code successfully in the past on > another cluster (~200 nodes of Sunfire X2100 > >>> 2x dual-core Opterons) w/no issues on upwards of 390 slots. I even > tested it recently on OpenMPI 1.8.5 > >>> on another smaller R&D cluster consisting of 10 Sunfire X2100 nodes > (w/2 dual core Opterons apiece). > >>> On this particular cluster I've had success running this code on < 132 > slots. > >>> > >>> Anyway, here's the results of the following mpirun: > >>> > >>> mpirun -np 132 -display-devel-map --prefix > /hpc/apps/mpi/openmpi/1.8.6/ --hostfile hostfile-noslots --mca > btl_tcp_if_include eth0 --hetero-nodes --bind-to core > /hpc/home/lanew/mpi/openmpi/ProcessColors3 >> out.txt 2>&1 > >>> > >>> > -------------------------------------------------------------------------- > >>> WARNING: a request was made to bind a process. While the system > >>> supports binding the process itself, at least one node does NOT > >>> support binding memory to the process location. > >>> > >>> Node: csclprd3-6-1 > >>> > >>> This usually is due to not having the required NUMA support installed > >>> on the node. In some Linux distributions, the required support is > >>> contained in the libnumactl and libnumactl-devel packages. > >>> This is a warning only; your job will continue, though performance may > be degraded. 
> >>> > -------------------------------------------------------------------------- > >>> Data for JOB [51718,1] offset 0 > >>> > >>> Mapper requested: NULL Last mapper: round_robin Mapping policy: > BYSOCKET Ranking policy: SLOT > >>> Binding policy: CORE Cpu set: NULL PPR: NULL Cpus-per-rank: 1 > >>> Num new daemons: 0 New daemon starting vpid INVALID > >>> Num nodes: 15 > >>> > >>> Data for node: csclprd3-6-1 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],1] Daemon launched: True > >>> Num slots: 4 Slots in use: 4 Oversubscribed: FALSE > >>> Num slots allocated: 4 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 4 Next node_rank: 4 > >>> Data for proc: [[51718,1],0] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 0 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B][./.] > >>> Binding: [B/.][./.] > >>> Data for proc: [[51718,1],1] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 1 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./.][B/B] > >>> Binding: [./.][B/.] > >>> Data for proc: [[51718,1],2] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 2 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B][./.] > >>> Binding: [./B][./.] > >>> Data for proc: [[51718,1],3] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 3 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./.][B/B] > >>> Binding: [./.][./B] > >>> > >>> Data for node: csclprd3-6-5 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],2] Daemon launched: True > >>> Num slots: 4 Slots in use: 4 Oversubscribed: FALSE > >>> Num slots allocated: 4 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 4 Next node_rank: 4 > >>> Data for proc: [[51718,1],4] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 4 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B][./.] > >>> Binding: [B/.][./.] > >>> Data for proc: [[51718,1],5] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 5 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./.][B/B] > >>> Binding: [./.][B/.] > >>> Data for proc: [[51718,1],6] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 6 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B][./.] > >>> Binding: [./B][./.] > >>> Data for proc: [[51718,1],7] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 7 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./.][B/B] > >>> Binding: [./.][./B] > >>> > >>> Data for node: csclprd3-0-0 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],3] Daemon launched: True > >>> Num slots: 12 Slots in use: 12 Oversubscribed: FALSE > >>> Num slots allocated: 12 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 12 Next node_rank: 12 > >>> Data for proc: [[51718,1],8] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 8 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B/B/B/B/B][./././././.] > >>> Binding: [B/././././.][./././././.] > >>> Data for proc: [[51718,1],9] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 9 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./././././.][B/B/B/B/B/B] > >>> Binding: [./././././.][B/././././.] > >>> Data for proc: [[51718,1],10] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 10 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B/B/B/B/B][./././././.] > >>> Binding: [./B/./././.][./././././.] > >>> Data for proc: [[51718,1],11] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 11 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./././././.][B/B/B/B/B/B] > >>> Binding: [./././././.][./B/./././.] 
> >>> Data for proc: [[51718,1],12] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 12 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B/B/B/B/B][./././././.] > >>> Binding: [././B/././.][./././././.] > >>> Data for proc: [[51718,1],13] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 13 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./././././.][B/B/B/B/B/B] > >>> Binding: [./././././.][././B/././.] > >>> Data for proc: [[51718,1],14] > >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 14 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B/B/B/B/B][./././././.] > >>> Binding: [./././B/./.][./././././.] > >>> Data for proc: [[51718,1],15] > >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 15 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./././././.][B/B/B/B/B/B] > >>> Binding: [./././././.][./././B/./.] > >>> Data for proc: [[51718,1],16] > >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 16 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B/B/B/B/B][./././././.] > >>> Binding: [././././B/.][./././././.] > >>> Data for proc: [[51718,1],17] > >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 17 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./././././.][B/B/B/B/B/B] > >>> Binding: [./././././.][././././B/.] > >>> Data for proc: [[51718,1],18] > >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 18 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B/B/B/B/B][./././././.] > >>> Binding: [./././././B][./././././.] > >>> Data for proc: [[51718,1],19] > >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 19 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./././././.][B/B/B/B/B/B] > >>> Binding: [./././././.][./././././B] > >>> > >>> Data for node: csclprd3-0-1 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],4] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],20] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 20 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [B/././././.] > >>> Data for proc: [[51718,1],21] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 21 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./B/./././.] > >>> Data for proc: [[51718,1],22] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 22 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././B/././.] > >>> Data for proc: [[51718,1],23] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 23 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././B/./.] > >>> Data for proc: [[51718,1],24] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 24 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././././B/.] > >>> Data for proc: [[51718,1],25] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 25 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././././B] > >>> > >>> Data for node: csclprd3-0-2 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],5] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],26] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 26 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [B/././././.] 
> >>> Data for proc: [[51718,1],27] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 27 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./B/./././.] > >>> Data for proc: [[51718,1],28] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 28 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././B/././.] > >>> Data for proc: [[51718,1],29] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 29 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././B/./.] > >>> Data for proc: [[51718,1],30] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 30 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././././B/.] > >>> Data for proc: [[51718,1],31] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 31 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././././B] > >>> > >>> Data for node: csclprd3-0-3 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],6] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],32] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 32 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [B/././././.] > >>> Data for proc: [[51718,1],33] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 33 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./B/./././.] > >>> Data for proc: [[51718,1],34] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 34 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././B/././.] > >>> Data for proc: [[51718,1],35] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 35 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././B/./.] > >>> Data for proc: [[51718,1],36] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 36 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././././B/.] > >>> Data for proc: [[51718,1],37] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 37 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././././B] > >>> > >>> Data for node: csclprd3-0-4 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],7] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],38] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 38 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [B/././././.] > >>> Data for proc: [[51718,1],39] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 39 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./B/./././.] > >>> Data for proc: [[51718,1],40] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 40 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././B/././.] > >>> Data for proc: [[51718,1],41] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 41 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././B/./.] > >>> Data for proc: [[51718,1],42] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 42 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././././B/.] 
> >>> Data for proc: [[51718,1],43] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 43 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././././B] > >>> > >>> Data for node: csclprd3-0-5 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],8] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],44] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 44 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [B/././././.] > >>> Data for proc: [[51718,1],45] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 45 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./B/./././.] > >>> Data for proc: [[51718,1],46] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 46 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././B/././.] > >>> Data for proc: [[51718,1],47] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 47 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././B/./.] > >>> Data for proc: [[51718,1],48] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 48 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././././B/.] > >>> Data for proc: [[51718,1],49] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 49 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././././B] > >>> > >>> Data for node: csclprd3-0-6 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],9] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],50] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 50 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [B/././././.] > >>> Data for proc: [[51718,1],51] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 51 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./B/./././.] > >>> Data for proc: [[51718,1],52] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 52 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././B/././.] > >>> Data for proc: [[51718,1],53] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 53 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././B/./.] > >>> Data for proc: [[51718,1],54] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 54 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././././B/.] > >>> Data for proc: [[51718,1],55] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 55 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././././B] > >>> > >>> Data for node: csclprd3-0-7 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],10] Daemon launched: True > >>> Num slots: 16 Slots in use: 16 Oversubscribed: FALSE > >>> Num slots allocated: 16 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 16 Next node_rank: 16 > >>> Data for proc: [[51718,1],56] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 56 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [BB/../../../../../../..][../../../../../../../..] 
> >>> Data for proc: [[51718,1],57] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 57 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][BB/../../../../../../..] > >>> Data for proc: [[51718,1],58] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 58 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../BB/../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],59] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 59 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../BB/../../../../../..] > >>> Data for proc: [[51718,1],60] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 60 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../BB/../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],61] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 61 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../BB/../../../../..] > >>> Data for proc: [[51718,1],62] > >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 62 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../BB/../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],63] > >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 63 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../BB/../../../..] > >>> Data for proc: [[51718,1],64] > >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 64 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../BB/../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],65] > >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 65 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../BB/../../..] > >>> Data for proc: [[51718,1],66] > >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 66 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../BB/../..][../../../../../../../..] > >>> Data for proc: [[51718,1],67] > >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 67 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../BB/../..] > >>> Data for proc: [[51718,1],68] > >>> Pid: 0 Local rank: 12 Node rank: 12 App rank: 68 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../BB/..][../../../../../../../..] > >>> Data for proc: [[51718,1],69] > >>> Pid: 0 Local rank: 13 Node rank: 13 App rank: 69 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../BB/..] 
> >>> Data for proc: [[51718,1],70] > >>> Pid: 0 Local rank: 14 Node rank: 14 App rank: 70 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../../BB][../../../../../../../..] > >>> Data for proc: [[51718,1],71] > >>> Pid: 0 Local rank: 15 Node rank: 15 App rank: 71 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../../BB] > >>> > >>> Data for node: csclprd3-0-8 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],11] Daemon launched: True > >>> Num slots: 16 Slots in use: 16 Oversubscribed: FALSE > >>> Num slots allocated: 16 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 16 Next node_rank: 16 > >>> Data for proc: [[51718,1],72] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 72 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [BB/../../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],73] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 73 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][BB/../../../../../../..] > >>> Data for proc: [[51718,1],74] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 74 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../BB/../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],75] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 75 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../BB/../../../../../..] > >>> Data for proc: [[51718,1],76] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 76 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../BB/../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],77] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 77 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../BB/../../../../..] > >>> Data for proc: [[51718,1],78] > >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 78 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../BB/../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],79] > >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 79 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../BB/../../../..] > >>> Data for proc: [[51718,1],80] > >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 80 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../BB/../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],81] > >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 81 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../BB/../../..] 
> >>> Data for proc: [[51718,1],82] > >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 82 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../BB/../..][../../../../../../../..] > >>> Data for proc: [[51718,1],83] > >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 83 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../BB/../..] > >>> Data for proc: [[51718,1],84] > >>> Pid: 0 Local rank: 12 Node rank: 12 App rank: 84 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../BB/..][../../../../../../../..] > >>> Data for proc: [[51718,1],85] > >>> Pid: 0 Local rank: 13 Node rank: 13 App rank: 85 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../BB/..] > >>> Data for proc: [[51718,1],86] > >>> Pid: 0 Local rank: 14 Node rank: 14 App rank: 86 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../../BB][../../../../../../../..] > >>> Data for proc: [[51718,1],87] > >>> Pid: 0 Local rank: 15 Node rank: 15 App rank: 87 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../../BB] > >>> > >>> Data for node: csclprd3-0-10 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],12] Daemon launched: True > >>> Num slots: 16 Slots in use: 16 Oversubscribed: FALSE > >>> Num slots allocated: 16 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 16 Next node_rank: 16 > >>> Data for proc: [[51718,1],88] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 88 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [BB/../../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],89] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 89 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][BB/../../../../../../..] > >>> Data for proc: [[51718,1],90] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 90 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../BB/../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],91] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 91 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../BB/../../../../../..] > >>> Data for proc: [[51718,1],92] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 92 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../BB/../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],93] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 93 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../BB/../../../../..] 
> >>> Data for proc: [[51718,1],94] > >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 94 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../BB/../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],95] > >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 95 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../BB/../../../..] > >>> Data for proc: [[51718,1],96] > >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 96 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../BB/../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],97] > >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 97 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../BB/../../..] > >>> Data for proc: [[51718,1],98] > >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 98 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../BB/../..][../../../../../../../..] > >>> Data for proc: [[51718,1],99] > >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 99 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../BB/../..] > >>> Data for proc: [[51718,1],100] > >>> Pid: 0 Local rank: 12 Node rank: 12 App rank: 100 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../BB/..][../../../../../../../..] > >>> Data for proc: [[51718,1],101] > >>> Pid: 0 Local rank: 13 Node rank: 13 App rank: 101 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../BB/..] > >>> Data for proc: [[51718,1],102] > >>> Pid: 0 Local rank: 14 Node rank: 14 App rank: 102 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../../BB][../../../../../../../..] > >>> Data for proc: [[51718,1],103] > >>> Pid: 0 Local rank: 15 Node rank: 15 App rank: 103 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../../BB] > >>> > >>> Data for node: csclprd3-0-11 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],13] Daemon launched: True > >>> Num slots: 16 Slots in use: 16 Oversubscribed: FALSE > >>> Num slots allocated: 16 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 16 Next node_rank: 16 > >>> Data for proc: [[51718,1],104] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 104 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [BB/../../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],105] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 105 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][BB/../../../../../../..] 
> >>> Data for proc: [[51718,1],106] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 106 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../BB/../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],107] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 107 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../BB/../../../../../..] > >>> Data for proc: [[51718,1],108] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 108 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../BB/../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],109] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 109 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../BB/../../../../..] > >>> Data for proc: [[51718,1],110] > >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 110 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../BB/../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],111] > >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 111 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../BB/../../../..] > >>> Data for proc: [[51718,1],112] > >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 112 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../BB/../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],113] > >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 113 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../BB/../../..] > >>> Data for proc: [[51718,1],114] > >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 114 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../BB/../..][../../../../../../../..] > >>> Data for proc: [[51718,1],115] > >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 115 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../BB/../..] > >>> Data for proc: [[51718,1],116] > >>> Pid: 0 Local rank: 12 Node rank: 12 App rank: 116 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../BB/..][../../../../../../../..] > >>> Data for proc: [[51718,1],117] > >>> Pid: 0 Local rank: 13 Node rank: 13 App rank: 117 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../BB/..] > >>> Data for proc: [[51718,1],118] > >>> Pid: 0 Local rank: 14 Node rank: 14 App rank: 118 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../../BB][../../../../../../../..] 
> >>> Data for proc: [[51718,1],119] > >>> Pid: 0 Local rank: 15 Node rank: 15 App rank: 119 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../../BB] > >>> > >>> Data for node: csclprd3-0-12 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],14] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],120] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 120 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [BB/../../../../..] > >>> Data for proc: [[51718,1],121] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 121 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [../BB/../../../..] > >>> Data for proc: [[51718,1],122] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 122 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [../../BB/../../..] > >>> Data for proc: [[51718,1],123] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 123 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [../../../BB/../..] > >>> Data for proc: [[51718,1],124] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 124 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [../../../../BB/..] > >>> Data for proc: [[51718,1],125] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 125 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [../../../../../BB] > >>> > >>> Data for node: csclprd3-0-13 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],15] Daemon launched: True > >>> Num slots: 12 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 12 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],126] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 126 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB][../../../../../..] > >>> Binding: [BB/../../../../..][../../../../../..] > >>> Data for proc: [[51718,1],127] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 127 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../..][BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../..][BB/../../../../..] > >>> Data for proc: [[51718,1],128] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 128 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB][../../../../../..] > >>> Binding: [../BB/../../../..][../../../../../..] > >>> Data for proc: [[51718,1],129] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 129 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../..][BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../..][../BB/../../../..] > >>> Data for proc: [[51718,1],130] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 130 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB][../../../../../..] > >>> Binding: [../../BB/../../..][../../../../../..] > >>> Data for proc: [[51718,1],131] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 131 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../..][BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../..][../../BB/../../..] 
> >>> [csclprd3-0-13:31619] *** Process received signal *** > >>> [csclprd3-0-13:31619] Signal: Bus error (7) > >>> [csclprd3-0-13:31619] Signal code: Non-existant physical address (2) > >>> [csclprd3-0-13:31619] Failing at address: 0x7f1374267a00 > >>> [csclprd3-0-13:31620] *** Process received signal *** > >>> [csclprd3-0-13:31620] Signal: Bus error (7) > >>> [csclprd3-0-13:31620] Signal code: Non-existant physical address (2) > >>> [csclprd3-0-13:31620] Failing at address: 0x7fcc702a7980 > >>> [csclprd3-0-13:31615] *** Process received signal *** > >>> [csclprd3-0-13:31615] Signal: Bus error (7) > >>> [csclprd3-0-13:31615] Signal code: Non-existant physical address (2) > >>> [csclprd3-0-13:31615] Failing at address: 0x7f8128367880 > >>> [csclprd3-0-13:31616] *** Process received signal *** > >>> [csclprd3-0-13:31616] Signal: Bus error (7) > >>> [csclprd3-0-13:31616] Signal code: Non-existant physical address (2) > >>> [csclprd3-0-13:31616] Failing at address: 0x7fe674227a00 > >>> [csclprd3-0-13:31617] *** Process received signal *** > >>> [csclprd3-0-13:31617] Signal: Bus error (7) > >>> [csclprd3-0-13:31617] Signal code: Non-existant physical address (2) > >>> [csclprd3-0-13:31617] Failing at address: 0x7f061c32db80 > >>> [csclprd3-0-13:31618] *** Process received signal *** > >>> [csclprd3-0-13:31618] Signal: Bus error (7) > >>> [csclprd3-0-13:31618] Signal code: Non-existant physical address (2) > >>> [csclprd3-0-13:31618] Failing at address: 0x7fb8402eaa80 > >>> [csclprd3-0-13:31618] [ 0] > /lib64/libpthread.so.0(+0xf500)[0x7fb851851500] > >>> [csclprd3-0-13:31618] [ 1] [csclprd3-0-13:31616] [ 0] > /lib64/libpthread.so.0(+0xf500)[0x7fe6843a4500] > >>> [csclprd3-0-13:31616] [ 1] [csclprd3-0-13:31620] [ 0] > /lib64/libpthread.so.0(+0xf500)[0x7fcc80c54500] > >>> [csclprd3-0-13:31620] [ 1] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fcc80fc9f61] > >>> [csclprd3-0-13:31620] [ 2] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fcc80fca047] > >>> [csclprd3-0-13:31620] [ 3] [csclprd3-0-13:31615] [ 0] > /lib64/libpthread.so.0(+0xf500)[0x7f81385ca500] > >>> [csclprd3-0-13:31615] [ 1] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f813893ff61] > >>> [csclprd3-0-13:31615] [ 2] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f8138940047] > >>> [csclprd3-0-13:31615] [ 3] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fb851bc6f61] > >>> [csclprd3-0-13:31618] [ 2] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fb851bc7047] > >>> [csclprd3-0-13:31618] [ 3] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fb851ab4670] > >>> [csclprd3-0-13:31618] [ 4] [csclprd3-0-13:31617] [ 0] > /lib64/libpthread.so.0(+0xf500)[0x7f062cfe5500] > >>> [csclprd3-0-13:31617] [ 1] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f062d35af61] > >>> [csclprd3-0-13:31617] [ 2] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f062d35b047] > >>> [csclprd3-0-13:31617] [ 3] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f062d248670] > >>> [csclprd3-0-13:31617] [ 4] [csclprd3-0-13:31619] [ 0] > /lib64/libpthread.so.0(+0xf500)[0x7f1384fde500] > >>> [csclprd3-0-13:31619] [ 1] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f1385353f61] > >>> [csclprd3-0-13:31619] [ 2] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fe684719f61] > >>> [csclprd3-0-13:31616] [ 2] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fe68471a047] > >>> [csclprd3-0-13:31616] [ 3] > 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fe684607670] > >>> [csclprd3-0-13:31616] [ 4] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f1385354047] > >>> [csclprd3-0-13:31619] [ 3] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f1385241670] > >>> [csclprd3-0-13:31619] [ 4] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f13852425ab] > >>> [csclprd3-0-13:31619] [ 5] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f1385242751] > >>> [csclprd3-0-13:31619] [ 6] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f13853501c9] > >>> [csclprd3-0-13:31619] [ 7] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f1385336628] > >>> [csclprd3-0-13:31619] [ 8] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fcc80eb7670] > >>> [csclprd3-0-13:31620] [ 4] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fcc80eb85ab] > >>> [csclprd3-0-13:31620] [ 5] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fcc80eb8751] > >>> [csclprd3-0-13:31620] [ 6] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fcc80fc61c9] > >>> [csclprd3-0-13:31620] [ 7] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fcc80fac628] > >>> [csclprd3-0-13:31620] [ 8] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fcc8111fd61] > >>> [csclprd3-0-13:31620] [ 9] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f813882d670] > >>> [csclprd3-0-13:31615] [ 4] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f813882e5ab] > >>> [csclprd3-0-13:31615] [ 5] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f813882e751] > >>> [csclprd3-0-13:31615] [ 6] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f813893c1c9] > >>> [csclprd3-0-13:31615] [ 7] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f8138922628] > >>> [csclprd3-0-13:31615] [ 8] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f8138a95d61] > >>> [csclprd3-0-13:31615] [ 9] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f813885d747] > >>> [csclprd3-0-13:31615] [10] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fb851ab55ab] > >>> [csclprd3-0-13:31618] [ 5] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fb851ab5751] > >>> [csclprd3-0-13:31618] [ 6] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fb851bc31c9] > >>> [csclprd3-0-13:31618] [ 7] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fb851ba9628] > >>> [csclprd3-0-13:31618] [ 8] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fb851d1cd61] > >>> [csclprd3-0-13:31618] [ 9] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fb851ae4747] > >>> [csclprd3-0-13:31618] [10] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f062d2495ab] > >>> [csclprd3-0-13:31617] [ 5] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f062d249751] > >>> [csclprd3-0-13:31617] [ 6] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f062d3571c9] > >>> [csclprd3-0-13:31617] [ 7] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f062d33d628] > >>> [csclprd3-0-13:31617] [ 8] > 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f062d4b0d61] > >>> [csclprd3-0-13:31617] [ 9] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f062d278747] > >>> [csclprd3-0-13:31617] [10] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fe6846085ab] > >>> [csclprd3-0-13:31616] [ 5] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fe684608751] > >>> [csclprd3-0-13:31616] [ 6] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fe6847161c9] > >>> [csclprd3-0-13:31616] [ 7] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fe6846fc628] > >>> [csclprd3-0-13:31616] [ 8] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fe68486fd61] > >>> [csclprd3-0-13:31616] [ 9] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fe684637747] > >>> [csclprd3-0-13:31616] [10] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fe68467750b] > >>> [csclprd3-0-13:31616] [11] > /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] > >>> [csclprd3-0-13:31616] [12] > /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fe684021cdd] > >>> [csclprd3-0-13:31616] [13] > /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] > >>> [csclprd3-0-13:31616] *** End of error message *** > >>> > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f062d2b850b] > >>> [csclprd3-0-13:31617] [11] > /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] > >>> [csclprd3-0-13:31617] [12] > /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f062cc62cdd] > >>> [csclprd3-0-13:31617] [13] > /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] > >>> [csclprd3-0-13:31617] *** End of error message *** > >>> > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f13854a9d61] > >>> [csclprd3-0-13:31619] [ 9] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f1385271747] > >>> [csclprd3-0-13:31619] [10] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f13852b150b] > >>> [csclprd3-0-13:31619] [11] > /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] > >>> [csclprd3-0-13:31619] [12] > /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f1384c5bcdd] > >>> [csclprd3-0-13:31619] [13] > /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] > >>> [csclprd3-0-13:31619] *** End of error message *** > >>> > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fcc80ee7747] > >>> [csclprd3-0-13:31620] [10] > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fcc80f2750b] > >>> [csclprd3-0-13:31620] [11] > /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] > >>> [csclprd3-0-13:31620] [12] > /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fcc808d1cdd] > >>> [csclprd3-0-13:31620] [13] > /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] > >>> [csclprd3-0-13:31620] *** End of error message *** > >>> > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f813889d50b] > >>> [csclprd3-0-13:31615] [11] > /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] > >>> [csclprd3-0-13:31615] [12] > /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8138247cdd] > >>> [csclprd3-0-13:31615] [13] > /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] > >>> [csclprd3-0-13:31615] *** End of error message *** > >>> > /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fb851b2450b] > >>> [csclprd3-0-13:31618] [11] > /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] > >>> [csclprd3-0-13:31618] [12] > 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7fb8514cecdd] > >>> [csclprd3-0-13:31618] [13] > /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] > >>> [csclprd3-0-13:31618] *** End of error message *** > >>> > -------------------------------------------------------------------------- > >>> mpirun noticed that process rank 126 with PID 0 on node csclprd3-0-13 > exited on signal 7 (Bus error). > >>> > -------------------------------------------------------------------------- > >>> > >>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [ > r...@open-mpi.org] > >>> Sent: Tuesday, June 23, 2015 6:20 PM > >>> To: Open MPI Users > >>> Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = > crash > >>> > >>> Wow - that is one sick puppy! I see that some nodes are reporting > not-bound for their procs, and the rest are binding to socket (as they > should). Some of your nodes clearly do not have hyper threads enabled (or > only have single-thread cores on them), and have 2 cores/socket. Other > nodes have 8 cores/socket with hyper threads enabled, while still others > have 6 cores/socket and HT enabled. > >>> > >>> I don't see anyone binding to a single HT if multiple HTs/core are > available. I think you are being fooled by those nodes that either don't > have HT enabled, or have only 1 HT/core. > >>> > >>> In both cases, node 13 is the one that fails. I also note > that you said everything works okay with < 132 ranks, and node 13 hosts > ranks 127-131. So node 13 would host ranks even if you reduced the number > in the job to 131. This would imply that it probably isn't something wrong > with the node itself. > >>> > >>> Is there any way you could run a job of this size on a homogeneous > cluster? The procs all show bindings that look right, but I'm wondering if > the heterogeneity is the source of the trouble here. We may be > communicating the binding pattern incorrectly and giving bad info to the > backend daemon. > >>> > >>> Also, rather than --report-bindings, use "--display-devel-map" on the > command line and let's see what the mapper thinks it did. If there is a > problem with placement, that is where it would exist. > >>> > >>> > >>> On Tue, Jun 23, 2015 at 5:12 PM, Lane, William <william.l...@cshs.org> > wrote: > >>> Ralph, > >>> > >>> There is something funny going on: the traces from the > >>> runs w/the debug build aren't showing any differences from > >>> what I got earlier. However, I did do a run w/the --bind-to core > >>> switch and was surprised to see that hyperthreading cores were > >>> sometimes being used. > >>> > >>> Here are the traces that I have: > >>> > >>> mpirun -np 132 -report-bindings --prefix /hpc/apps/mpi/openmpi/1.8.6/ > --hostfile hostfile-noslots --mca btl_tcp_if_include eth0 --hetero-nodes > /hpc/home/lanew/mpi/openmpi/ProcessColors3 > >>> [csclprd3-0-5:16802] MCW rank 44 is not bound (or bound to all > available processors) > >>> [csclprd3-0-5:16802] MCW rank 45 is not bound (or bound to all > available processors) > >>> [csclprd3-0-5:16802] MCW rank 46 is not bound (or bound to all > available processors) > >>> [csclprd3-6-5:12480] MCW rank 4 bound to socket 0[core 0[hwt 0]], > socket 0[core 1[hwt 0]]: [B/B][./.] > >>> [csclprd3-6-5:12480] MCW rank 5 bound to socket 1[core 2[hwt 0]], > socket 1[core 3[hwt 0]]: [./.][B/B] > >>> [csclprd3-6-5:12480] MCW rank 6 bound to socket 0[core 0[hwt 0]], > socket 0[core 1[hwt 0]]: [B/B][./.]
> >>> [csclprd3-6-5:12480] MCW rank 7 bound to socket 1[core 2[hwt 0]], > socket 1[core 3[hwt 0]]: [./.][B/B] > >>> [csclprd3-0-5:16802] MCW rank 47 is not bound (or bound to all > available processors) > >>> [csclprd3-0-5:16802] MCW rank 48 is not bound (or bound to all > available processors) > >>> [csclprd3-0-5:16802] MCW rank 49 is not bound (or bound to all > available processors) > >>> [csclprd3-0-1:14318] MCW rank 22 is not bound (or bound to all > available processors) > >>> [csclprd3-0-1:14318] MCW rank 23 is not bound (or bound to all > available processors) > >>> [csclprd3-0-1:14318] MCW rank 24 is not bound (or bound to all > available processors) > >>> [csclprd3-6-1:24682] MCW rank 3 bound to socket 1[core 2[hwt 0]], > socket 1[core 3[hwt 0]]: [./.][B/B] > >>> [csclprd3-6-1:24682] MCW rank 0 bound to socket 0[core 0[hwt 0]], > socket 0[core 1[hwt 0]]: [B/B][./.] > >>> [csclprd3-0-1:14318] MCW rank 25 is not bound (or bound to all > available processors) > >>> [csclprd3-0-1:14318] MCW rank 20 is not bound (or bound to all > available processors) > >>> [csclprd3-0-3:13827] MCW rank 34 is not bound (or bound to all > available processors) > >>> [csclprd3-0-1:14318] MCW rank 21 is not bound (or bound to all > available processors) > >>> [csclprd3-0-3:13827] MCW rank 35 is not bound (or bound to all > available processors) > >>> [csclprd3-6-1:24682] MCW rank 1 bound to socket 1[core 2[hwt 0]], > socket 1[core 3[hwt 0]]: [./.][B/B] > >>> [csclprd3-0-3:13827] MCW rank 36 is not bound (or bound to all > available processors) > >>> [csclprd3-6-1:24682] MCW rank 2 bound to socket 0[core 0[hwt 0]], > socket 0[core 1[hwt 0]]: [B/B][./.] > >>> [csclprd3-0-6:30371] MCW rank 51 is not bound (or bound to all > available processors) > >>> [csclprd3-0-6:30371] MCW rank 52 is not bound (or bound to all > available processors) > >>> [csclprd3-0-6:30371] MCW rank 53 is not bound (or bound to all > available processors) > >>> [csclprd3-0-2:05825] MCW rank 30 is not bound (or bound to all > available processors) > >>> [csclprd3-0-6:30371] MCW rank 54 is not bound (or bound to all > available processors) > >>> [csclprd3-0-3:13827] MCW rank 37 is not bound (or bound to all > available processors) > >>> [csclprd3-0-2:05825] MCW rank 31 is not bound (or bound to all > available processors) > >>> [csclprd3-0-3:13827] MCW rank 32 is not bound (or bound to all > available processors) > >>> [csclprd3-0-6:30371] MCW rank 55 is not bound (or bound to all > available processors) > >>> [csclprd3-0-3:13827] MCW rank 33 is not bound (or bound to all > available processors) > >>> [csclprd3-0-6:30371] MCW rank 50 is not bound (or bound to all > available processors) > >>> [csclprd3-0-2:05825] MCW rank 26 is not bound (or bound to all > available processors) > >>> [csclprd3-0-2:05825] MCW rank 27 is not bound (or bound to all > available processors) > >>> [csclprd3-0-2:05825] MCW rank 28 is not bound (or bound to all > available processors) > >>> [csclprd3-0-2:05825] MCW rank 29 is not bound (or bound to all > available processors) > >>> [csclprd3-0-12:12383] MCW rank 121 is not bound (or bound to all > available processors) > >>> [csclprd3-0-12:12383] MCW rank 122 is not bound (or bound to all > available processors) > >>> [csclprd3-0-12:12383] MCW rank 123 is not bound (or bound to all > available processors) > >>> [csclprd3-0-12:12383] MCW rank 124 is not bound (or bound to all > available processors) > >>> [csclprd3-0-12:12383] MCW rank 125 is not bound (or bound to all > available processors) > >>> 
[csclprd3-0-12:12383] MCW rank 120 is not bound (or bound to all > available processors) > >>> [csclprd3-0-0:31079] MCW rank 13 bound to socket 1[core 6[hwt 0]], > socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], > socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: > [./././././.][B/B/B/B/B/B] > >>> [csclprd3-0-0:31079] MCW rank 14 bound to socket 0[core 0[hwt 0]], > socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], > socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.] > >>> [csclprd3-0-0:31079] MCW rank 15 bound to socket 1[core 6[hwt 0]], > socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], > socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: > [./././././.][B/B/B/B/B/B] > >>> [csclprd3-0-0:31079] MCW rank 16 bound to socket 0[core 0[hwt 0]], > socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], > socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.] > >>> [csclprd3-0-7:20515] MCW rank 68 bound to socket 0[core 0[hwt 0-1]], > socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core > 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-10:19096] MCW rank 100 bound to socket 0[core 0[hwt 0-1]], > socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core > 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-7:20515] MCW rank 69 bound to socket 1[core 8[hwt 0-1]], > socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt > 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket > 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-10:19096] MCW rank 101 bound to socket 1[core 8[hwt 0-1]], > socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt > 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket > 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-0:31079] MCW rank 17 bound to socket 1[core 6[hwt 0]], > socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], > socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: > [./././././.][B/B/B/B/B/B] > >>> [csclprd3-0-7:20515] MCW rank 70 bound to socket 0[core 0[hwt 0-1]], > socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core > 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-10:19096] MCW rank 102 bound to socket 0[core 0[hwt 0-1]], > socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core > 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-11:31636] MCW rank 116 bound to socket 0[core 0[hwt 0-1]], > socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core > 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] 
> >>> [csclprd3-0-11:31636] MCW rank 117 bound to socket 1[core 8[hwt 0-1]], > socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt > 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket > 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-0:31079] MCW rank 18 bound to socket 0[core 0[hwt 0]], > socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], > socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.] > >>> [csclprd3-0-11:31636] MCW rank 118 bound to socket 0[core 0[hwt 0-1]], > socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core > 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-0:31079] MCW rank 19 bound to socket 1[core 6[hwt 0]], > socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], > socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: > [./././././.][B/B/B/B/B/B] > >>> [csclprd3-0-7:20515] MCW rank 71 bound to socket 1[core 8[hwt 0-1]], > socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt > 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket > 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-10:19096] MCW rank 103 bound to socket 1[core 8[hwt 0-1]], > socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt > 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket > 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-0:31079] MCW rank 8 bound to socket 0[core 0[hwt 0]], > socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], > socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.] > >>> [csclprd3-0-0:31079] MCW rank 9 bound to socket 1[core 6[hwt 0]], > socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], > socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: > [./././././.][B/B/B/B/B/B] > >>> [csclprd3-0-10:19096] MCW rank 88 bound to socket 0[core 0[hwt 0-1]], > socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core > 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-11:31636] MCW rank 119 bound to socket 1[core 8[hwt 0-1]], > socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt > 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket > 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-7:20515] MCW rank 56 bound to socket 0[core 0[hwt 0-1]], > socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core > 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-0:31079] MCW rank 10 bound to socket 0[core 0[hwt 0]], > socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], > socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.] 
> >>> [csclprd3-0-7:20515] MCW rank 57 bound to socket 1[core 8[hwt 0-1]], > socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt > 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket > 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-10:19096] MCW rank 89 bound to socket 1[core 8[hwt 0-1]], > socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt > 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket > 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-11:31636] MCW rank 104 bound to socket 0[core 0[hwt 0-1]], > socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core > 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-0:31079] MCW rank 11 bound to socket 1[core 6[hwt 0]], > socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], > socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: > [./././././.][B/B/B/B/B/B] > >>> [csclprd3-0-0:31079] MCW rank 12 bound to socket 0[core 0[hwt 0]], > socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], > socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.] > >>> [csclprd3-0-4:30348] MCW rank 42 is not bound (or bound to all
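As a follow-up to Ralph's suggestion above, the run could be repeated with "--display-devel-map" in place of "-report-bindings" so that the mapper's own view of the placement is printed. The line below is simply Bill's quoted command with that one option swapped; every other flag (the prefix, the hostfile, the btl_tcp_if_include setting, and --hetero-nodes) is copied unchanged from the command shown earlier:

mpirun -np 132 --display-devel-map --prefix /hpc/apps/mpi/openmpi/1.8.6/ --hostfile hostfile-noslots --mca btl_tcp_if_include eth0 --hetero-nodes /hpc/home/lanew/mpi/openmpi/ProcessColors3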
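The backtraces earlier in the thread all fail inside MPI_Init (ompi_mpi_init -> mca_pml_ob1_add_procs -> mca_btl_sm_add_procs -> ompi_free_list_grow), before any application work starts. A hypothetical init-only test program, sketched below, would show whether the bus error can be reproduced without ProcessColors3 at all; the file name init_only.c and the build line are illustrative assumptions, not something from the original thread (it assumes the mpicc wrapper from the same 1.8.6 install is on the PATH):

    /* init_only.c -- hypothetical minimal reproducer: initialize and
     * finalize MPI, printing each rank's host so a failing node such as
     * csclprd3-0-13 can be spotted in the output. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, len;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);          /* the quoted backtraces crash inside this call */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(host, &len);
        printf("rank %d of %d running on %s\n", rank, size, host);
        MPI_Finalize();
        return 0;
    }

Built with something like "mpicc init_only.c -o init_only" and launched with the same mpirun arguments used for ProcessColors3, a crash here would point at MPI initialization itself (for example the shared-memory BTL setup visible in the backtraces) rather than at the application code.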