Would the output of dmidecode -t processor and/or lstopo tell me conclusively whether hyperthreading is enabled? Hyperthreading is supposed to be enabled on all the IBM x3550 M3 and M4 nodes, but I'm not sure it actually is, and I don't have access to the BIOS settings.
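For reference, here is the userspace check I was planning to rely on - my assumption being that a dmidecode Thread Count of twice the Core Count, or lstopo showing two PUs under a single Core, means HT is on:

  # per-socket counts from the SMBIOS tables
  # (only present in the newer 40-byte type-4 records; the older blades' 32-byte records omit them)
  dmidecode -t processor | egrep 'Core Count|Thread Count'

  # threads per core as the kernel sees them
  lscpu | egrep 'Thread\(s\) per core|Core\(s\) per socket|Socket\(s\)'

Is that a safe assumption, or can the SMBIOS tables be stale?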
-Bill L.

________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
Sent: Saturday, June 27, 2015 7:21 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash

Bill - this is such a jumbled collection of machines that I'm having trouble figuring out what I should replicate. I can create something artificial here so I can try to debug this, but I need to know exactly what I'm up against - can you tell me:

* the architecture of each type - how many sockets, how many cores/socket, HT on or off. If two nodes have the same physical setup but one has HT on and the other off, then please consider those two different types
* how many nodes of each type

Looking at your map output, it looks like the map is being done correctly, but somehow the binding locale isn't getting set in some cases. Your latest error output would seem out-of-step with your prior reports, so something else may be going on there. As I said earlier, this is the most hetero environment we've seen, and so there may be some code paths you're hitting that haven't been well exercised before.

On Jun 26, 2015, at 5:22 PM, Lane, William <william.l...@cshs.org> wrote:

Well, I managed to get a successful mpirun @ a slot count of 132 using --mca btl ^sm; however, when I increased the slot count to 160, mpirun crashed without any error output:

mpirun -np 160 -display-devel-map --prefix /hpc/apps/mpi/openmpi/1.8.6/ --hostfile hostfile-noslots --mca btl ^sm --hetero-nodes --bind-to core /hpc/home/lanew/mpi/openmpi/ProcessColors3 >> out.txt 2>&1

--------------------------------------------------------------------------
WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.

  Node: csclprd3-6-1

This usually is due to not having the required NUMA support installed
on the node. In some Linux distributions, the required support is
contained in the libnumactl and libnumactl-devel packages.

This is a warning only; your job will continue, though performance
may be degraded.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A request was made to bind to that would result in binding more
processes than cpus on a resource:

  Bind to:     CORE
  Node:        csclprd3-6-1
  #processes:  2
  #cpus:       1

You can override this protection by adding the "overload-allowed"
option to your binding directive.
--------------------------------------------------------------------------
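If oversubscribing those blade cores ever becomes acceptable, my reading of that message is that the qualifier gets appended to the bind directive, i.e. something like this (untested on my end):

  mpirun -np 160 --hostfile hostfile-noslots --mca btl ^sm --hetero-nodes \
      --bind-to core:overload-allowed /hpc/home/lanew/mpi/openmpi/ProcessColors3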
But csclprd3-6-1 (a blade) does have 2 CPUs on 2 separate sockets w/2 cores apiece, as shown in my dmidecode output:

csclprd3-6-1 ~]# dmidecode -t processor
# dmidecode 2.11
SMBIOS 2.4 present.

Handle 0x0008, DMI type 4, 32 bytes
Processor Information
    Socket Designation: Socket 1 CPU 1
    Type: Central Processor
    Family: Xeon
    Manufacturer: GenuineIntel
    ID: F6 06 00 00 01 03 00 00
    Signature: Type 0, Family 6, Model 15, Stepping 6
    Flags:
        FPU (Floating-point unit on-chip)
        CX8 (CMPXCHG8 instruction supported)
        APIC (On-chip APIC hardware supported)
    Version: Intel Xeon
    Voltage: 2.9 V
    External Clock: 333 MHz
    Max Speed: 4000 MHz
    Current Speed: 3000 MHz
    Status: Populated, Enabled
    Upgrade: ZIF Socket
    L1 Cache Handle: 0x0004
    L2 Cache Handle: 0x0005
    L3 Cache Handle: Not Provided

Handle 0x0009, DMI type 4, 32 bytes
Processor Information
    Socket Designation: Socket 2 CPU 2
    Type: Central Processor
    Family: Xeon
    Manufacturer: GenuineIntel
    ID: F6 06 00 00 01 03 00 00
    Signature: Type 0, Family 6, Model 15, Stepping 6
    Flags:
        FPU (Floating-point unit on-chip)
        CX8 (CMPXCHG8 instruction supported)
        APIC (On-chip APIC hardware supported)
    Version: Intel Xeon
    Voltage: 2.9 V
    External Clock: 333 MHz
    Max Speed: 4000 MHz
    Current Speed: 3000 MHz
    Status: Populated, Enabled
    Upgrade: ZIF Socket
    L1 Cache Handle: 0x0006
    L2 Cache Handle: 0x0007
    L3 Cache Handle: Not Provided

csclprd3-6-1 ~]# lstopo
Machine (16GB)
  Socket L#0 + L2 L#0 (4096KB)
    L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
    L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#2)
  Socket L#1 + L2 L#1 (4096KB)
    L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#1)
    L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)

csclprd3-0-1 information, which looks correct, as this particular x3550 should have one of its two sockets populated with a 6-core Xeon (or 12 logical cores w/hyperthreading turned on):

csclprd3-0-1 ~]# lstopo
Machine (71GB)
  Socket L#0 + L3 L#0 (12MB)
    L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
    L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
    L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
    L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
    L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4 (P#4)
    L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5 (P#5)

csclprd3-0-1 ~]# dmidecode -t processor
# dmidecode 2.11
# SMBIOS entry point at 0x7f6be000
SMBIOS 2.5 present.
Handle 0x0001, DMI type 4, 40 bytes
Processor Information
    Socket Designation: Node 1 Socket 1
    Type: Central Processor
    Family: Xeon MP
    Manufacturer: Intel(R) Corporation
    ID: C2 06 02 00 FF FB EB BF
    Signature: Type 0, Family 6, Model 44, Stepping 2
    Flags:
        FPU (Floating-point unit on-chip)
        VME (Virtual mode extension)
        DE (Debugging extension)
        PSE (Page size extension)
        TSC (Time stamp counter)
        MSR (Model specific registers)
        PAE (Physical address extension)
        MCE (Machine check exception)
        CX8 (CMPXCHG8 instruction supported)
        APIC (On-chip APIC hardware supported)
        SEP (Fast system call)
        MTRR (Memory type range registers)
        PGE (Page global enable)
        MCA (Machine check architecture)
        CMOV (Conditional move instruction supported)
        PAT (Page attribute table)
        PSE-36 (36-bit page size extension)
        CLFSH (CLFLUSH instruction supported)
        DS (Debug store)
        ACPI (ACPI supported)
        MMX (MMX technology supported)
        FXSR (FXSAVE and FXSTOR instructions supported)
        SSE (Streaming SIMD extensions)
        SSE2 (Streaming SIMD extensions 2)
        SS (Self-snoop)
        HTT (Multi-threading)
        TM (Thermal monitor supported)
        PBE (Pending break enabled)
    Version: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz
    Voltage: 1.2 V
    External Clock: 5866 MHz
    Max Speed: 4400 MHz
    Current Speed: 2400 MHz
    Status: Populated, Enabled
    Upgrade: ZIF Socket
    L1 Cache Handle: 0x0002
    L2 Cache Handle: 0x0003
    L3 Cache Handle: 0x0004
    Serial Number: Not Specified
    Asset Tag: Not Specified
    Part Number: Not Specified
    Core Count: 6
    Core Enabled: 6
    Thread Count: 6
    Characteristics:
        64-bit capable

Handle 0x005A, DMI type 4, 40 bytes
Processor Information
    Socket Designation: Node 1 Socket 2
    Type: Central Processor
    Family: Xeon MP
    Manufacturer: Not Specified
    ID: 00 00 00 00 00 00 00 00
    Signature: Type 0, Family 0, Model 0, Stepping 0
    Flags: None
    Version: Not Specified
    Voltage: 1.2 V
    External Clock: 5866 MHz
    Max Speed: 4400 MHz
    Current Speed: Unknown
    Status: Unpopulated
    Upgrade: ZIF Socket
    L1 Cache Handle: Not Provided
    L2 Cache Handle: Not Provided
    L3 Cache Handle: Not Provided
    Serial Number: Not Specified
    Asset Tag: Not Specified
    Part Number: Not Specified
    Characteristics: None
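If I'm reading the type-4 record right, Core Count: 6 with Thread Count: 6 would mean hyperthreading is actually off on this node at the moment (with HT on I'd expect Thread Count: 12), and the second socket is simply unpopulated. The one-liner I've been using to spot-check this across nodes (it assumes the BIOS fills in the 40-byte Core Count/Thread Count fields):

  dmidecode -t processor | awk '/Core Count/ {c=$3} /Thread Count/ {t=$3; if (t > c) ht=1} END {print (ht ? "HT enabled" : "HT off or not reported")}'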
________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
Sent: Wednesday, June 24, 2015 6:06 AM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash

I think trying with --mca btl ^sm makes a lot of sense and may solve the problem.

I also noted that we are having trouble with the topology of several of the nodes - seeing only one socket, non-HT, where you say we should see two sockets and HT enabled. In those cases, the locality is "unknown" - given that those procs are on remote nodes from the one being impacted, I don't think it should cause a problem. However, it isn't correct, and that raises flags. My best guess of the root cause of that error is that either we are getting bad topology info on those nodes, or we have a bug that is mishandling this scenario. It would probably be good to get this error fixed to ensure it isn't the source of the eventual crash, even though I'm not sure they are related.

Bill: Can we examine one of the problem nodes? Let's pick csclprd3-0-1 (or take another one from your list - just look for one where "locality" is reported as "unknown" for the procs in the output map). Can you run lstopo on that node and send us the output? In the above map, it is reporting a single socket with 6 cores, non-HT. Is that what lstopo shows when run on the node? Is it what you expected?

On Wed, Jun 24, 2015 at 4:07 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:

Bill,

were you able to get a core file and analyze the stack with gdb? I suspect the error occurs in mca_btl_sm_add_procs, but this is just my best guess. If this is correct, can you check the value of mca_btl_sm_component.num_smp_procs?

As a workaround, can you try mpirun --mca btl ^sm ...

I do not see how I can tackle the root cause without being able to reproduce the issue :-( Can you try to reproduce the issue with the smallest hostfile, and then run lstopo on all the nodes?

BTW, you are not mixing 32-bit and 64-bit OSes, are you?

Cheers,

Gilles
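In case it helps, here is roughly what I have in mind - paths and the core file name below are examples only, adjust to your system:

  ulimit -c unlimited    # before mpirun, so a crashing rank leaves a core file
  gdb /hpc/home/lanew/mpi/openmpi/ProcessColors3 <corefile>
  (gdb) bt
  (gdb) print mca_btl_sm_component.num_smp_procs

and for the topology survey, something like:

  for h in $(cat hostfile-noslots); do echo "== $h =="; ssh $h lstopo; done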
On Wednesday, June 24, 2015, Lane, William <william.l...@cshs.org> wrote:

Gilles,

All the blades only have two-core Xeons (without hyperthreading) populating both their sockets. All the x3550 nodes have hyperthreading-capable Xeons and Sandy Bridge server CPUs. It's possible hyperthreading has been disabled on some of these nodes, though. The 3-0-n nodes are all IBM x3550 nodes, while the 3-6-n nodes are all blade nodes.

I have run this exact same test code successfully in the past on another cluster (~200 nodes of Sunfire X2100 2x dual-core Opterons) w/no issues on upwards of 390 slots. I even tested it recently with OpenMPI 1.8.5 on another, smaller R&D cluster consisting of 10 Sunfire X2100 nodes (w/2 dual-core Opterons apiece). On this particular cluster I've had success running this code on < 132 slots.

Anyway, here are the results of the following mpirun:

mpirun -np 132 -display-devel-map --prefix /hpc/apps/mpi/openmpi/1.8.6/ --hostfile hostfile-noslots --mca btl_tcp_if_include eth0 --hetero-nodes --bind-to core /hpc/home/lanew/mpi/openmpi/ProcessColors3 >> out.txt 2>&1

--------------------------------------------------------------------------
WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.

  Node: csclprd3-6-1

This usually is due to not having the required NUMA support installed
on the node. In some Linux distributions, the required support is
contained in the libnumactl and libnumactl-devel packages.

This is a warning only; your job will continue, though performance
may be degraded.
--------------------------------------------------------------------------

Data for JOB [51718,1] offset 0 Mapper requested: NULL Last mapper: round_robin Mapping policy: BYSOCKET Ranking policy: SLOT Binding policy: CORE Cpu set: NULL PPR: NULL Cpus-per-rank: 1 Num new daemons: 0 New daemon starting vpid INVALID Num nodes: 15 Data for node: csclprd3-6-1 Launch id: -1 State: 0 Daemon: [[51718,0],1] Daemon launched: True Num slots: 4 Slots in use: 4 Oversubscribed: FALSE Num slots allocated: 4 Max slots: 0 Username on node: NULL Num procs: 4 Next node_rank: 4 Data for proc: [[51718,1],0] Pid: 0 Local rank: 0 Node rank: 0 App rank: 0 State: INITIALIZED App_context: 0 Locale: [B/B][./.] Binding: [B/.][./.] Data for proc: [[51718,1],1] Pid: 0 Local rank: 1 Node rank: 1 App rank: 1 State: INITIALIZED App_context: 0 Locale: [./.][B/B] Binding: [./.][B/.] Data for proc: [[51718,1],2] Pid: 0 Local rank: 2 Node rank: 2 App rank: 2 State: INITIALIZED App_context: 0 Locale: [B/B][./.] Binding: [./B][./.] Data for proc: [[51718,1],3] Pid: 0 Local rank: 3 Node rank: 3 App rank: 3 State: INITIALIZED App_context: 0 Locale: [./.][B/B] Binding: [./.][./B] Data for node: csclprd3-6-5 Launch id: -1 State: 0 Daemon: [[51718,0],2] Daemon launched: True Num slots: 4 Slots in use: 4 Oversubscribed: FALSE Num slots allocated: 4 Max slots: 0 Username on node: NULL Num procs: 4 Next node_rank: 4 Data for proc: [[51718,1],4] Pid: 0 Local rank: 0 Node rank: 0 App rank: 4 State: INITIALIZED App_context: 0 Locale: [B/B][./.] Binding: [B/.][./.] Data for proc: [[51718,1],5] Pid: 0 Local rank: 1 Node rank: 1 App rank: 5 State: INITIALIZED App_context: 0 Locale: [./.][B/B] Binding: [./.][B/.] Data for proc: [[51718,1],6] Pid: 0 Local rank: 2 Node rank: 2 App rank: 6 State: INITIALIZED App_context: 0 Locale: [B/B][./.] Binding: [./B][./.] Data for proc: [[51718,1],7] Pid: 0 Local rank: 3 Node rank: 3 App rank: 7 State: INITIALIZED App_context: 0 Locale: [./.][B/B] Binding: [./.][./B] Data for node: csclprd3-0-0 Launch id: -1 State: 0 Daemon: [[51718,0],3] Daemon launched: True Num slots: 12 Slots in use: 12 Oversubscribed: FALSE Num slots allocated: 12 Max slots: 0 Username on node: NULL Num procs: 12 Next node_rank: 12 Data for proc: [[51718,1],8] Pid: 0 Local rank: 0 Node rank: 0 App rank: 8 State: INITIALIZED App_context: 0 Locale: [B/B/B/B/B/B][./././././.] Binding: [B/././././.][./././././.] Data for proc: [[51718,1],9] Pid: 0 Local rank: 1 Node rank: 1 App rank: 9 State: INITIALIZED App_context: 0 Locale: [./././././.][B/B/B/B/B/B] Binding: [./././././.][B/././././.] Data for proc: [[51718,1],10] Pid: 0 Local rank: 2 Node rank: 2 App rank: 10 State: INITIALIZED App_context: 0 Locale: [B/B/B/B/B/B][./././././.] Binding: [./B/./././.][./././././.] Data for proc: [[51718,1],11] Pid: 0 Local rank: 3 Node rank: 3 App rank: 11 State: INITIALIZED App_context: 0 Locale: [./././././.][B/B/B/B/B/B] Binding: [./././././.][./B/./././.] Data for proc: [[51718,1],12] Pid: 0 Local rank: 4 Node rank: 4 App rank: 12 State: INITIALIZED App_context: 0 Locale: [B/B/B/B/B/B][./././././.] Binding: [././B/././.][./././././.] Data for proc: [[51718,1],13] Pid: 0 Local rank: 5 Node rank: 5 App rank: 13 State: INITIALIZED App_context: 0 Locale: [./././././.][B/B/B/B/B/B] Binding: [./././././.][././B/././.] Data for proc: [[51718,1],14] Pid: 0 Local rank: 6 Node rank: 6 App rank: 14 State: INITIALIZED App_context: 0 Locale: [B/B/B/B/B/B][./././././.] Binding: [./././B/./.][./././././.] Data for proc: [[51718,1],15] Pid: 0 Local rank: 7 Node rank: 7 App rank: 15 State: INITIALIZED App_context: 0 Locale: [./././././.][B/B/B/B/B/B] Binding: [./././././.][./././B/./.] Data for proc: [[51718,1],16] Pid: 0 Local rank: 8 Node rank: 8 App rank: 16 State: INITIALIZED App_context: 0 Locale: [B/B/B/B/B/B][./././././.] Binding: [././././B/.][./././././.] Data for proc: [[51718,1],17] Pid: 0 Local rank: 9 Node rank: 9 App rank: 17 State: INITIALIZED App_context: 0 Locale: [./././././.][B/B/B/B/B/B] Binding: [./././././.][././././B/.] Data for proc: [[51718,1],18] Pid: 0 Local rank: 10 Node rank: 10 App rank: 18 State: INITIALIZED App_context: 0 Locale: [B/B/B/B/B/B][./././././.] Binding: [./././././B][./././././.]
Data for proc: [[51718,1],19] Pid: 0 Local rank: 11 Node rank: 11 App rank: 19 State: INITIALIZED App_context: 0 Locale: [./././././.][B/B/B/B/B/B] Binding: [./././././.][./././././B] Data for node: csclprd3-0-1 Launch id: -1 State: 0 Daemon: [[51718,0],4] Daemon launched: True Num slots: 6 Slots in use: 6 Oversubscribed: FALSE Num slots allocated: 6 Max slots: 0 Username on node: NULL Num procs: 6 Next node_rank: 6 Data for proc: [[51718,1],20] Pid: 0 Local rank: 0 Node rank: 0 App rank: 20 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [B/././././.] Data for proc: [[51718,1],21] Pid: 0 Local rank: 1 Node rank: 1 App rank: 21 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./B/./././.] Data for proc: [[51718,1],22] Pid: 0 Local rank: 2 Node rank: 2 App rank: 22 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [././B/././.] Data for proc: [[51718,1],23] Pid: 0 Local rank: 3 Node rank: 3 App rank: 23 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./././B/./.] Data for proc: [[51718,1],24] Pid: 0 Local rank: 4 Node rank: 4 App rank: 24 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [././././B/.] Data for proc: [[51718,1],25] Pid: 0 Local rank: 5 Node rank: 5 App rank: 25 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./././././B] Data for node: csclprd3-0-2 Launch id: -1 State: 0 Daemon: [[51718,0],5] Daemon launched: True Num slots: 6 Slots in use: 6 Oversubscribed: FALSE Num slots allocated: 6 Max slots: 0 Username on node: NULL Num procs: 6 Next node_rank: 6 Data for proc: [[51718,1],26] Pid: 0 Local rank: 0 Node rank: 0 App rank: 26 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [B/././././.] Data for proc: [[51718,1],27] Pid: 0 Local rank: 1 Node rank: 1 App rank: 27 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./B/./././.] Data for proc: [[51718,1],28] Pid: 0 Local rank: 2 Node rank: 2 App rank: 28 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [././B/././.] Data for proc: [[51718,1],29] Pid: 0 Local rank: 3 Node rank: 3 App rank: 29 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./././B/./.] Data for proc: [[51718,1],30] Pid: 0 Local rank: 4 Node rank: 4 App rank: 30 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [././././B/.] Data for proc: [[51718,1],31] Pid: 0 Local rank: 5 Node rank: 5 App rank: 31 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./././././B] Data for node: csclprd3-0-3 Launch id: -1 State: 0 Daemon: [[51718,0],6] Daemon launched: True Num slots: 6 Slots in use: 6 Oversubscribed: FALSE Num slots allocated: 6 Max slots: 0 Username on node: NULL Num procs: 6 Next node_rank: 6 Data for proc: [[51718,1],32] Pid: 0 Local rank: 0 Node rank: 0 App rank: 32 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [B/././././.] Data for proc: [[51718,1],33] Pid: 0 Local rank: 1 Node rank: 1 App rank: 33 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./B/./././.] Data for proc: [[51718,1],34] Pid: 0 Local rank: 2 Node rank: 2 App rank: 34 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [././B/././.] Data for proc: [[51718,1],35] Pid: 0 Local rank: 3 Node rank: 3 App rank: 35 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./././B/./.] Data for proc: [[51718,1],36] Pid: 0 Local rank: 4 Node rank: 4 App rank: 36 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [././././B/.] 
Data for proc: [[51718,1],37] Pid: 0 Local rank: 5 Node rank: 5 App rank: 37 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./././././B] Data for node: csclprd3-0-4 Launch id: -1 State: 0 Daemon: [[51718,0],7] Daemon launched: True Num slots: 6 Slots in use: 6 Oversubscribed: FALSE Num slots allocated: 6 Max slots: 0 Username on node: NULL Num procs: 6 Next node_rank: 6 Data for proc: [[51718,1],38] Pid: 0 Local rank: 0 Node rank: 0 App rank: 38 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [B/././././.] Data for proc: [[51718,1],39] Pid: 0 Local rank: 1 Node rank: 1 App rank: 39 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./B/./././.] Data for proc: [[51718,1],40] Pid: 0 Local rank: 2 Node rank: 2 App rank: 40 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [././B/././.] Data for proc: [[51718,1],41] Pid: 0 Local rank: 3 Node rank: 3 App rank: 41 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./././B/./.] Data for proc: [[51718,1],42] Pid: 0 Local rank: 4 Node rank: 4 App rank: 42 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [././././B/.] Data for proc: [[51718,1],43] Pid: 0 Local rank: 5 Node rank: 5 App rank: 43 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./././././B] Data for node: csclprd3-0-5 Launch id: -1 State: 0 Daemon: [[51718,0],8] Daemon launched: True Num slots: 6 Slots in use: 6 Oversubscribed: FALSE Num slots allocated: 6 Max slots: 0 Username on node: NULL Num procs: 6 Next node_rank: 6 Data for proc: [[51718,1],44] Pid: 0 Local rank: 0 Node rank: 0 App rank: 44 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [B/././././.] Data for proc: [[51718,1],45] Pid: 0 Local rank: 1 Node rank: 1 App rank: 45 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./B/./././.] Data for proc: [[51718,1],46] Pid: 0 Local rank: 2 Node rank: 2 App rank: 46 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [././B/././.] Data for proc: [[51718,1],47] Pid: 0 Local rank: 3 Node rank: 3 App rank: 47 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./././B/./.] Data for proc: [[51718,1],48] Pid: 0 Local rank: 4 Node rank: 4 App rank: 48 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [././././B/.] Data for proc: [[51718,1],49] Pid: 0 Local rank: 5 Node rank: 5 App rank: 49 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./././././B] Data for node: csclprd3-0-6 Launch id: -1 State: 0 Daemon: [[51718,0],9] Daemon launched: True Num slots: 6 Slots in use: 6 Oversubscribed: FALSE Num slots allocated: 6 Max slots: 0 Username on node: NULL Num procs: 6 Next node_rank: 6 Data for proc: [[51718,1],50] Pid: 0 Local rank: 0 Node rank: 0 App rank: 50 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [B/././././.] Data for proc: [[51718,1],51] Pid: 0 Local rank: 1 Node rank: 1 App rank: 51 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./B/./././.] Data for proc: [[51718,1],52] Pid: 0 Local rank: 2 Node rank: 2 App rank: 52 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [././B/././.] Data for proc: [[51718,1],53] Pid: 0 Local rank: 3 Node rank: 3 App rank: 53 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./././B/./.] Data for proc: [[51718,1],54] Pid: 0 Local rank: 4 Node rank: 4 App rank: 54 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [././././B/.] 
Data for proc: [[51718,1],55] Pid: 0 Local rank: 5 Node rank: 5 App rank: 55 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [./././././B] Data for node: csclprd3-0-7 Launch id: -1 State: 0 Daemon: [[51718,0],10] Daemon launched: True Num slots: 16 Slots in use: 16 Oversubscribed: FALSE Num slots allocated: 16 Max slots: 0 Username on node: NULL Num procs: 16 Next node_rank: 16 Data for proc: [[51718,1],56] Pid: 0 Local rank: 0 Node rank: 0 App rank: 56 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [BB/../../../../../../..][../../../../../../../..] Data for proc: [[51718,1],57] Pid: 0 Local rank: 1 Node rank: 1 App rank: 57 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][BB/../../../../../../..] Data for proc: [[51718,1],58] Pid: 0 Local rank: 2 Node rank: 2 App rank: 58 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../BB/../../../../../..][../../../../../../../..] Data for proc: [[51718,1],59] Pid: 0 Local rank: 3 Node rank: 3 App rank: 59 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../BB/../../../../../..] Data for proc: [[51718,1],60] Pid: 0 Local rank: 4 Node rank: 4 App rank: 60 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../BB/../../../../..][../../../../../../../..] Data for proc: [[51718,1],61] Pid: 0 Local rank: 5 Node rank: 5 App rank: 61 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../BB/../../../../..] Data for proc: [[51718,1],62] Pid: 0 Local rank: 6 Node rank: 6 App rank: 62 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../BB/../../../..][../../../../../../../..] Data for proc: [[51718,1],63] Pid: 0 Local rank: 7 Node rank: 7 App rank: 63 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../BB/../../../..] Data for proc: [[51718,1],64] Pid: 0 Local rank: 8 Node rank: 8 App rank: 64 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../BB/../../..][../../../../../../../..] Data for proc: [[51718,1],65] Pid: 0 Local rank: 9 Node rank: 9 App rank: 65 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../BB/../../..] Data for proc: [[51718,1],66] Pid: 0 Local rank: 10 Node rank: 10 App rank: 66 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../../BB/../..][../../../../../../../..] Data for proc: [[51718,1],67] Pid: 0 Local rank: 11 Node rank: 11 App rank: 67 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../../BB/../..] Data for proc: [[51718,1],68] Pid: 0 Local rank: 12 Node rank: 12 App rank: 68 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../../../BB/..][../../../../../../../..] 
Data for proc: [[51718,1],69] Pid: 0 Local rank: 13 Node rank: 13 App rank: 69 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../../../BB/..] Data for proc: [[51718,1],70] Pid: 0 Local rank: 14 Node rank: 14 App rank: 70 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../../../../BB][../../../../../../../..] Data for proc: [[51718,1],71] Pid: 0 Local rank: 15 Node rank: 15 App rank: 71 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../../../../BB] Data for node: csclprd3-0-8 Launch id: -1 State: 0 Daemon: [[51718,0],11] Daemon launched: True Num slots: 16 Slots in use: 16 Oversubscribed: FALSE Num slots allocated: 16 Max slots: 0 Username on node: NULL Num procs: 16 Next node_rank: 16 Data for proc: [[51718,1],72] Pid: 0 Local rank: 0 Node rank: 0 App rank: 72 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [BB/../../../../../../..][../../../../../../../..] Data for proc: [[51718,1],73] Pid: 0 Local rank: 1 Node rank: 1 App rank: 73 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][BB/../../../../../../..] Data for proc: [[51718,1],74] Pid: 0 Local rank: 2 Node rank: 2 App rank: 74 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../BB/../../../../../..][../../../../../../../..] Data for proc: [[51718,1],75] Pid: 0 Local rank: 3 Node rank: 3 App rank: 75 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../BB/../../../../../..] Data for proc: [[51718,1],76] Pid: 0 Local rank: 4 Node rank: 4 App rank: 76 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../BB/../../../../..][../../../../../../../..] Data for proc: [[51718,1],77] Pid: 0 Local rank: 5 Node rank: 5 App rank: 77 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../BB/../../../../..] Data for proc: [[51718,1],78] Pid: 0 Local rank: 6 Node rank: 6 App rank: 78 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../BB/../../../..][../../../../../../../..] Data for proc: [[51718,1],79] Pid: 0 Local rank: 7 Node rank: 7 App rank: 79 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../BB/../../../..] Data for proc: [[51718,1],80] Pid: 0 Local rank: 8 Node rank: 8 App rank: 80 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../BB/../../..][../../../../../../../..] Data for proc: [[51718,1],81] Pid: 0 Local rank: 9 Node rank: 9 App rank: 81 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../BB/../../..] Data for proc: [[51718,1],82] Pid: 0 Local rank: 10 Node rank: 10 App rank: 82 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../../BB/../..][../../../../../../../..] 
Data for proc: [[51718,1],83] Pid: 0 Local rank: 11 Node rank: 11 App rank: 83 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../../BB/../..] Data for proc: [[51718,1],84] Pid: 0 Local rank: 12 Node rank: 12 App rank: 84 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../../../BB/..][../../../../../../../..] Data for proc: [[51718,1],85] Pid: 0 Local rank: 13 Node rank: 13 App rank: 85 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../../../BB/..] Data for proc: [[51718,1],86] Pid: 0 Local rank: 14 Node rank: 14 App rank: 86 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../../../../BB][../../../../../../../..] Data for proc: [[51718,1],87] Pid: 0 Local rank: 15 Node rank: 15 App rank: 87 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../../../../BB] Data for node: csclprd3-0-10 Launch id: -1 State: 0 Daemon: [[51718,0],12] Daemon launched: True Num slots: 16 Slots in use: 16 Oversubscribed: FALSE Num slots allocated: 16 Max slots: 0 Username on node: NULL Num procs: 16 Next node_rank: 16 Data for proc: [[51718,1],88] Pid: 0 Local rank: 0 Node rank: 0 App rank: 88 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [BB/../../../../../../..][../../../../../../../..] Data for proc: [[51718,1],89] Pid: 0 Local rank: 1 Node rank: 1 App rank: 89 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][BB/../../../../../../..] Data for proc: [[51718,1],90] Pid: 0 Local rank: 2 Node rank: 2 App rank: 90 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../BB/../../../../../..][../../../../../../../..] Data for proc: [[51718,1],91] Pid: 0 Local rank: 3 Node rank: 3 App rank: 91 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../BB/../../../../../..] Data for proc: [[51718,1],92] Pid: 0 Local rank: 4 Node rank: 4 App rank: 92 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../BB/../../../../..][../../../../../../../..] Data for proc: [[51718,1],93] Pid: 0 Local rank: 5 Node rank: 5 App rank: 93 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../BB/../../../../..] Data for proc: [[51718,1],94] Pid: 0 Local rank: 6 Node rank: 6 App rank: 94 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../BB/../../../..][../../../../../../../..] Data for proc: [[51718,1],95] Pid: 0 Local rank: 7 Node rank: 7 App rank: 95 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../BB/../../../..] Data for proc: [[51718,1],96] Pid: 0 Local rank: 8 Node rank: 8 App rank: 96 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../BB/../../..][../../../../../../../..] 
Data for proc: [[51718,1],97] Pid: 0 Local rank: 9 Node rank: 9 App rank: 97 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../BB/../../..] Data for proc: [[51718,1],98] Pid: 0 Local rank: 10 Node rank: 10 App rank: 98 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../../BB/../..][../../../../../../../..] Data for proc: [[51718,1],99] Pid: 0 Local rank: 11 Node rank: 11 App rank: 99 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../../BB/../..] Data for proc: [[51718,1],100] Pid: 0 Local rank: 12 Node rank: 12 App rank: 100 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../../../BB/..][../../../../../../../..] Data for proc: [[51718,1],101] Pid: 0 Local rank: 13 Node rank: 13 App rank: 101 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../../../BB/..] Data for proc: [[51718,1],102] Pid: 0 Local rank: 14 Node rank: 14 App rank: 102 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../../../../BB][../../../../../../../..] Data for proc: [[51718,1],103] Pid: 0 Local rank: 15 Node rank: 15 App rank: 103 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../../../../BB] Data for node: csclprd3-0-11 Launch id: -1 State: 0 Daemon: [[51718,0],13] Daemon launched: True Num slots: 16 Slots in use: 16 Oversubscribed: FALSE Num slots allocated: 16 Max slots: 0 Username on node: NULL Num procs: 16 Next node_rank: 16 Data for proc: [[51718,1],104] Pid: 0 Local rank: 0 Node rank: 0 App rank: 104 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [BB/../../../../../../..][../../../../../../../..] Data for proc: [[51718,1],105] Pid: 0 Local rank: 1 Node rank: 1 App rank: 105 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][BB/../../../../../../..] Data for proc: [[51718,1],106] Pid: 0 Local rank: 2 Node rank: 2 App rank: 106 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../BB/../../../../../..][../../../../../../../..] Data for proc: [[51718,1],107] Pid: 0 Local rank: 3 Node rank: 3 App rank: 107 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../BB/../../../../../..] Data for proc: [[51718,1],108] Pid: 0 Local rank: 4 Node rank: 4 App rank: 108 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../BB/../../../../..][../../../../../../../..] Data for proc: [[51718,1],109] Pid: 0 Local rank: 5 Node rank: 5 App rank: 109 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../BB/../../../../..] Data for proc: [[51718,1],110] Pid: 0 Local rank: 6 Node rank: 6 App rank: 110 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../BB/../../../..][../../../../../../../..] 
Data for proc: [[51718,1],111] Pid: 0 Local rank: 7 Node rank: 7 App rank: 111 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../BB/../../../..] Data for proc: [[51718,1],112] Pid: 0 Local rank: 8 Node rank: 8 App rank: 112 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../BB/../../..][../../../../../../../..] Data for proc: [[51718,1],113] Pid: 0 Local rank: 9 Node rank: 9 App rank: 113 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../BB/../../..] Data for proc: [[51718,1],114] Pid: 0 Local rank: 10 Node rank: 10 App rank: 114 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../../BB/../..][../../../../../../../..] Data for proc: [[51718,1],115] Pid: 0 Local rank: 11 Node rank: 11 App rank: 115 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../../BB/../..] Data for proc: [[51718,1],116] Pid: 0 Local rank: 12 Node rank: 12 App rank: 116 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../../../BB/..][../../../../../../../..] Data for proc: [[51718,1],117] Pid: 0 Local rank: 13 Node rank: 13 App rank: 117 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../../../BB/..] Data for proc: [[51718,1],118] Pid: 0 Local rank: 14 Node rank: 14 App rank: 118 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] Binding: [../../../../../../../BB][../../../../../../../..] Data for proc: [[51718,1],119] Pid: 0 Local rank: 15 Node rank: 15 App rank: 119 State: INITIALIZED App_context: 0 Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] Binding: [../../../../../../../..][../../../../../../../BB] Data for node: csclprd3-0-12 Launch id: -1 State: 0 Daemon: [[51718,0],14] Daemon launched: True Num slots: 6 Slots in use: 6 Oversubscribed: FALSE Num slots allocated: 6 Max slots: 0 Username on node: NULL Num procs: 6 Next node_rank: 6 Data for proc: [[51718,1],120] Pid: 0 Local rank: 0 Node rank: 0 App rank: 120 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [BB/../../../../..] Data for proc: [[51718,1],121] Pid: 0 Local rank: 1 Node rank: 1 App rank: 121 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [../BB/../../../..] Data for proc: [[51718,1],122] Pid: 0 Local rank: 2 Node rank: 2 App rank: 122 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [../../BB/../../..] Data for proc: [[51718,1],123] Pid: 0 Local rank: 3 Node rank: 3 App rank: 123 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [../../../BB/../..] Data for proc: [[51718,1],124] Pid: 0 Local rank: 4 Node rank: 4 App rank: 124 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [../../../../BB/..] 
Data for proc: [[51718,1],125] Pid: 0 Local rank: 5 Node rank: 5 App rank: 125 State: INITIALIZED App_context: 0 Locale: UNKNOWN Binding: [../../../../../BB] Data for node: csclprd3-0-13 Launch id: -1 State: 0 Daemon: [[51718,0],15] Daemon launched: True Num slots: 12 Slots in use: 6 Oversubscribed: FALSE Num slots allocated: 12 Max slots: 0 Username on node: NULL Num procs: 6 Next node_rank: 6 Data for proc: [[51718,1],126] Pid: 0 Local rank: 0 Node rank: 0 App rank: 126 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB][../../../../../..] Binding: [BB/../../../../..][../../../../../..] Data for proc: [[51718,1],127] Pid: 0 Local rank: 1 Node rank: 1 App rank: 127 State: INITIALIZED App_context: 0 Locale: [../../../../../..][BB/BB/BB/BB/BB/BB] Binding: [../../../../../..][BB/../../../../..] Data for proc: [[51718,1],128] Pid: 0 Local rank: 2 Node rank: 2 App rank: 128 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB][../../../../../..] Binding: [../BB/../../../..][../../../../../..] Data for proc: [[51718,1],129] Pid: 0 Local rank: 3 Node rank: 3 App rank: 129 State: INITIALIZED App_context: 0 Locale: [../../../../../..][BB/BB/BB/BB/BB/BB] Binding: [../../../../../..][../BB/../../../..] Data for proc: [[51718,1],130] Pid: 0 Local rank: 4 Node rank: 4 App rank: 130 State: INITIALIZED App_context: 0 Locale: [BB/BB/BB/BB/BB/BB][../../../../../..] Binding: [../../BB/../../..][../../../../../..] Data for proc: [[51718,1],131] Pid: 0 Local rank: 5 Node rank: 5 App rank: 131 State: INITIALIZED App_context: 0 Locale: [../../../../../..][BB/BB/BB/BB/BB/BB] Binding: [../../../../../..][../../BB/../../..] [csclprd3-0-13:31619] *** Process received signal *** [csclprd3-0-13:31619] Signal: Bus error (7) [csclprd3-0-13:31619] Signal code: Non-existant physical address (2) [csclprd3-0-13:31619] Failing at address: 0x7f1374267a00 [csclprd3-0-13:31620] *** Process received signal *** [csclprd3-0-13:31620] Signal: Bus error (7) [csclprd3-0-13:31620] Signal code: Non-existant physical address (2) [csclprd3-0-13:31620] Failing at address: 0x7fcc702a7980 [csclprd3-0-13:31615] *** Process received signal *** [csclprd3-0-13:31615] Signal: Bus error (7) [csclprd3-0-13:31615] Signal code: Non-existant physical address (2) [csclprd3-0-13:31615] Failing at address: 0x7f8128367880 [csclprd3-0-13:31616] *** Process received signal *** [csclprd3-0-13:31616] Signal: Bus error (7) [csclprd3-0-13:31616] Signal code: Non-existant physical address (2) [csclprd3-0-13:31616] Failing at address: 0x7fe674227a00 [csclprd3-0-13:31617] *** Process received signal *** [csclprd3-0-13:31617] Signal: Bus error (7) [csclprd3-0-13:31617] Signal code: Non-existant physical address (2) [csclprd3-0-13:31617] Failing at address: 0x7f061c32db80 [csclprd3-0-13:31618] *** Process received signal *** [csclprd3-0-13:31618] Signal: Bus error (7) [csclprd3-0-13:31618] Signal code: Non-existant physical address (2) [csclprd3-0-13:31618] Failing at address: 0x7fb8402eaa80 [csclprd3-0-13:31618] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7fb851851500] [csclprd3-0-13:31618] [ 1] [csclprd3-0-13:31616] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7fe6843a4500] [csclprd3-0-13:31616] [ 1] [csclprd3-0-13:31620] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7fcc80c54500] [csclprd3-0-13:31620] [ 1] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fcc80fc9f61] [csclprd3-0-13:31620] [ 2] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fcc80fca047] [csclprd3-0-13:31620] [ 3] [csclprd3-0-13:31615] [ 0] 
/lib64/libpthread.so.0(+0xf500)[0x7f81385ca500] [csclprd3-0-13:31615] [ 1] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f813893ff61] [csclprd3-0-13:31615] [ 2] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f8138940047] [csclprd3-0-13:31615] [ 3] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fb851bc6f61] [csclprd3-0-13:31618] [ 2] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fb851bc7047] [csclprd3-0-13:31618] [ 3] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fb851ab4670] [csclprd3-0-13:31618] [ 4] [csclprd3-0-13:31617] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7f062cfe5500] [csclprd3-0-13:31617] [ 1] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f062d35af61] [csclprd3-0-13:31617] [ 2] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f062d35b047] [csclprd3-0-13:31617] [ 3] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f062d248670] [csclprd3-0-13:31617] [ 4] [csclprd3-0-13:31619] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7f1384fde500] [csclprd3-0-13:31619] [ 1] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f1385353f61] [csclprd3-0-13:31619] [ 2] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fe684719f61] [csclprd3-0-13:31616] [ 2] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fe68471a047] [csclprd3-0-13:31616] [ 3] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fe684607670] [csclprd3-0-13:31616] [ 4] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f1385354047] [csclprd3-0-13:31619] [ 3] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f1385241670] [csclprd3-0-13:31619] [ 4] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f13852425ab] [csclprd3-0-13:31619] [ 5] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f1385242751] [csclprd3-0-13:31619] [ 6] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f13853501c9] [csclprd3-0-13:31619] [ 7] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f1385336628] [csclprd3-0-13:31619] [ 8] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fcc80eb7670] [csclprd3-0-13:31620] [ 4] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fcc80eb85ab] [csclprd3-0-13:31620] [ 5] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fcc80eb8751] [csclprd3-0-13:31620] [ 6] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fcc80fc61c9] [csclprd3-0-13:31620] [ 7] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fcc80fac628] [csclprd3-0-13:31620] [ 8] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fcc8111fd61] [csclprd3-0-13:31620] [ 9] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f813882d670] [csclprd3-0-13:31615] [ 4] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f813882e5ab] [csclprd3-0-13:31615] [ 5] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f813882e751] [csclprd3-0-13:31615] [ 6] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f813893c1c9] [csclprd3-0-13:31615] [ 7] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f8138922628] [csclprd3-0-13:31615] [ 8] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f8138a95d61] [csclprd3-0-13:31615] [ 9] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f813885d747] [csclprd3-0-13:31615] [10] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fb851ab55ab] [csclprd3-0-13:31618] [ 5] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fb851ab5751] [csclprd3-0-13:31618] [ 6] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fb851bc31c9] [csclprd3-0-13:31618] [ 7] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fb851ba9628] [csclprd3-0-13:31618] [ 8] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fb851d1cd61] [csclprd3-0-13:31618] [ 9] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fb851ae4747] [csclprd3-0-13:31618] [10] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f062d2495ab] [csclprd3-0-13:31617] [ 5] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f062d249751] [csclprd3-0-13:31617] [ 6] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f062d3571c9] [csclprd3-0-13:31617] [ 7] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f062d33d628] [csclprd3-0-13:31617] [ 8] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f062d4b0d61] [csclprd3-0-13:31617] [ 9] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f062d278747] [csclprd3-0-13:31617] [10] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fe6846085ab] [csclprd3-0-13:31616] [ 5] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fe684608751] [csclprd3-0-13:31616] [ 6] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fe6847161c9] [csclprd3-0-13:31616] [ 7] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fe6846fc628] [csclprd3-0-13:31616] [ 8] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fe68486fd61] [csclprd3-0-13:31616] [ 9] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fe684637747] [csclprd3-0-13:31616] [10] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fe68467750b] [csclprd3-0-13:31616] [11] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] [csclprd3-0-13:31616] [12] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fe684021cdd] [csclprd3-0-13:31616] [13] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] [csclprd3-0-13:31616] *** End of error message *** /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f062d2b850b] [csclprd3-0-13:31617] [11] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] [csclprd3-0-13:31617] [12] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f062cc62cdd] [csclprd3-0-13:31617] [13] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] [csclprd3-0-13:31617] *** End of error message *** /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f13854a9d61] [csclprd3-0-13:31619] [ 9] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f1385271747] [csclprd3-0-13:31619] [10] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f13852b150b] [csclprd3-0-13:31619] [11] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] [csclprd3-0-13:31619] [12] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f1384c5bcdd] [csclprd3-0-13:31619] [13] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] [csclprd3-0-13:31619] *** End of error message *** /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fcc80ee7747] [csclprd3-0-13:31620] [10] /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fcc80f2750b] [csclprd3-0-13:31620] [11] 
/hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] [csclprd3-0-13:31620] [12] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fcc808d1cdd] [csclprd3-0-13:31620] [13] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] [csclprd3-0-13:31620] *** End of error message *** /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f813889d50b] [csclprd3-0-13:31615] [11] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] [csclprd3-0-13:31615] [12] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8138247cdd] [csclprd3-0-13:31615] [13] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] [csclprd3-0-13:31615] *** End of error message *** /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fb851b2450b] [csclprd3-0-13:31618] [11] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] [csclprd3-0-13:31618] [12] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fb8514cecdd] [csclprd3-0-13:31618] [13] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] [csclprd3-0-13:31618] *** End of error message ***

--------------------------------------------------------------------------
mpirun noticed that process rank 126 with PID 0 on node csclprd3-0-13
exited on signal 7 (Bus error).
--------------------------------------------------------------------------

________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
Sent: Tuesday, June 23, 2015 6:20 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash

Wow - that is one sick puppy! I see that some nodes are reporting not-bound for their procs, and the rest are binding to socket (as they should). Some of your nodes clearly do not have hyperthreads enabled (or only have single-thread cores on them) and have 2 cores/socket. Other nodes have 8 cores/socket with hyperthreads enabled, while still others have 6 cores/socket and HT enabled. I don't see anyone binding to a single HT if multiple HTs/core are available. I think you are being fooled by those nodes that either don't have HT enabled, or have only 1 HT/core.

In both cases, it is node 13 that fails. I also note that you said everything works okay with < 132 ranks, and node 13 hosts ranks 127-131, so node 13 would host ranks even if you reduced the number in the job to 131. This would imply that it probably isn't something wrong with the node itself.

Is there any way you could run a job of this size on a homogeneous cluster? The procs all show bindings that look right, but I'm wondering if the heterogeneity is the source of the trouble here. We may be communicating the binding pattern incorrectly and giving bad info to the backend daemon.

Also, rather than --report-bindings, use "--display-devel-map" on the command line and let's see what the mapper thinks it did. If there is a problem with placement, that is where it would exist.

On Tue, Jun 23, 2015 at 5:12 PM, Lane, William <william.l...@cshs.org> wrote:

Ralph,

There is something funny going on; the traces from the runs w/the debug build aren't showing any differences from what I got earlier. However, I did do a run w/the --bind-to core switch and was surprised to see that hyperthreading cores were sometimes being used.
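(The way I'm spotting the hyperthreads: on the HT-capable nodes each core shows two hardware threads in the binding mask, e.g. core 0[hwt 0-1], and I've been cross-checking the sibling pairs with

  cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list

which should list two logical CPUs per core when HT is on and just one when it's off.)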
Here's the traces that I have: mpirun -np 132 -report-bindings --prefix /hpc/apps/mpi/openmpi/1.8.6/ --hostfile hostfile-noslots --mca btl_tcp_if_include eth0 --hetero-nodes /hpc/home/lanew/mpi/openmpi/ProcessColors3 [csclprd3-0-5:16802] MCW rank 44 is not bound (or bound to all available processors) [csclprd3-0-5:16802] MCW rank 45 is not bound (or bound to all available processors) [csclprd3-0-5:16802] MCW rank 46 is not bound (or bound to all available processors) [csclprd3-6-5:12480] MCW rank 4 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B][./.] [csclprd3-6-5:12480] MCW rank 5 bound to socket 1[core 2[hwt 0]], socket 1[core 3[hwt 0]]: [./.][B/B] [csclprd3-6-5:12480] MCW rank 6 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B][./.] [csclprd3-6-5:12480] MCW rank 7 bound to socket 1[core 2[hwt 0]], socket 1[core 3[hwt 0]]: [./.][B/B] [csclprd3-0-5:16802] MCW rank 47 is not bound (or bound to all available processors) [csclprd3-0-5:16802] MCW rank 48 is not bound (or bound to all available processors) [csclprd3-0-5:16802] MCW rank 49 is not bound (or bound to all available processors) [csclprd3-0-1:14318] MCW rank 22 is not bound (or bound to all available processors) [csclprd3-0-1:14318] MCW rank 23 is not bound (or bound to all available processors) [csclprd3-0-1:14318] MCW rank 24 is not bound (or bound to all available processors) [csclprd3-6-1:24682] MCW rank 3 bound to socket 1[core 2[hwt 0]], socket 1[core 3[hwt 0]]: [./.][B/B] [csclprd3-6-1:24682] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B][./.] [csclprd3-0-1:14318] MCW rank 25 is not bound (or bound to all available processors) [csclprd3-0-1:14318] MCW rank 20 is not bound (or bound to all available processors) [csclprd3-0-3:13827] MCW rank 34 is not bound (or bound to all available processors) [csclprd3-0-1:14318] MCW rank 21 is not bound (or bound to all available processors) [csclprd3-0-3:13827] MCW rank 35 is not bound (or bound to all available processors) [csclprd3-6-1:24682] MCW rank 1 bound to socket 1[core 2[hwt 0]], socket 1[core 3[hwt 0]]: [./.][B/B] [csclprd3-0-3:13827] MCW rank 36 is not bound (or bound to all available processors) [csclprd3-6-1:24682] MCW rank 2 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B][./.] 
[csclprd3-0-6:30371] MCW rank 51 is not bound (or bound to all available processors) [csclprd3-0-6:30371] MCW rank 52 is not bound (or bound to all available processors) [csclprd3-0-6:30371] MCW rank 53 is not bound (or bound to all available processors) [csclprd3-0-2:05825] MCW rank 30 is not bound (or bound to all available processors) [csclprd3-0-6:30371] MCW rank 54 is not bound (or bound to all available processors) [csclprd3-0-3:13827] MCW rank 37 is not bound (or bound to all available processors) [csclprd3-0-2:05825] MCW rank 31 is not bound (or bound to all available processors) [csclprd3-0-3:13827] MCW rank 32 is not bound (or bound to all available processors) [csclprd3-0-6:30371] MCW rank 55 is not bound (or bound to all available processors) [csclprd3-0-3:13827] MCW rank 33 is not bound (or bound to all available processors) [csclprd3-0-6:30371] MCW rank 50 is not bound (or bound to all available processors) [csclprd3-0-2:05825] MCW rank 26 is not bound (or bound to all available processors) [csclprd3-0-2:05825] MCW rank 27 is not bound (or bound to all available processors) [csclprd3-0-2:05825] MCW rank 28 is not bound (or bound to all available processors) [csclprd3-0-2:05825] MCW rank 29 is not bound (or bound to all available processors) [csclprd3-0-12:12383] MCW rank 121 is not bound (or bound to all available processors) [csclprd3-0-12:12383] MCW rank 122 is not bound (or bound to all available processors) [csclprd3-0-12:12383] MCW rank 123 is not bound (or bound to all available processors) [csclprd3-0-12:12383] MCW rank 124 is not bound (or bound to all available processors) [csclprd3-0-12:12383] MCW rank 125 is not bound (or bound to all available processors) [csclprd3-0-12:12383] MCW rank 120 is not bound (or bound to all available processors) [csclprd3-0-0:31079] MCW rank 13 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B] [csclprd3-0-0:31079] MCW rank 14 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.] [csclprd3-0-0:31079] MCW rank 15 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B] [csclprd3-0-0:31079] MCW rank 16 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.] [csclprd3-0-7:20515] MCW rank 68 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] [csclprd3-0-10:19096] MCW rank 100 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] 
[csclprd3-0-7:20515] MCW rank 69 bound to socket 1[core 8[hwt 0-1]], socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-10:19096] MCW rank 101 bound to socket 1[core 8[hwt 0-1]], socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-0:31079] MCW rank 17 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
[csclprd3-0-7:20515] MCW rank 70 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-10:19096] MCW rank 102 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-11:31636] MCW rank 116 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-11:31636] MCW rank 117 bound to socket 1[core 8[hwt 0-1]], socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-0:31079] MCW rank 18 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
[csclprd3-0-11:31636] MCW rank 118 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-0:31079] MCW rank 19 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
[csclprd3-0-7:20515] MCW rank 71 bound to socket 1[core 8[hwt 0-1]], socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-10:19096] MCW rank 103 bound to socket 1[core 8[hwt 0-1]], socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-0:31079] MCW rank 8 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
[csclprd3-0-0:31079] MCW rank 9 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
[csclprd3-0-10:19096] MCW rank 88 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-11:31636] MCW rank 119 bound to socket 1[core 8[hwt 0-1]], socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-7:20515] MCW rank 56 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-0:31079] MCW rank 10 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
[csclprd3-0-7:20515] MCW rank 57 bound to socket 1[core 8[hwt 0-1]], socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-10:19096] MCW rank 89 bound to socket 1[core 8[hwt 0-1]], socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-11:31636] MCW rank 104 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-0:31079] MCW rank 11 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
[csclprd3-0-0:31079] MCW rank 12 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
[csclprd3-0-4:30348] MCW rank 42 is not bound (or bound to all available processors)
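
Incidentally, the binding masks above already answer the hyperthreading question per node: csclprd3-0-0 reports every core as "core N[hwt 0]" (only one hardware thread visible per core), while csclprd3-0-7, csclprd3-0-10, and csclprd3-0-11 report "core N[hwt 0-1]" (two hardware threads per core), so HT is clearly enabled and active on the latter group but not exposed on the former. To confirm this directly on a node without touching the BIOS, something along these lines should be conclusive (a sketch only; lscpu is part of util-linux, and the sysfs topology path is standard on modern Linux kernels):

    lscpu | grep -i 'per core'
    # "Thread(s) per core: 2" means HT is enabled and in use; "1" means off or unsupported

    cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
    # a single PU (e.g. "0") means core 0 exposes one hardware thread;
    # a pair (e.g. "0,8") means two hardware threads share that core, i.e. HT is on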
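
And to inventory the sockets/cores/threads of every node type in one pass, a loop like the following would do it — again just a sketch, assuming passwordless ssh and a hypothetical "hostlist" file with one hostname per line:

    for h in $(cat hostlist); do
        echo "== $h =="
        # print the Socket(s), Core(s) per socket, and Thread(s) per core lines
        ssh "$h" "lscpu | egrep 'Socket|Core|Thread'"
    done

Hosts reporting the same Socket(s) / Core(s) per socket / Thread(s) per core triple can be treated as one node type; any host showing "Thread(s) per core: 2" is one where hyperthreading is both enabled in the BIOS and active in the kernel.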