Can you give it a try? I’m skeptical, but it might work. The 1.8.7 release candidate (rc) is out on the web site:
http://www.open-mpi.org/software/ompi/v1.8/ <http://www.open-mpi.org/software/ompi/v1.8/> > On Jul 14, 2015, at 11:17 AM, Lane, William <william.l...@cshs.org> wrote: > > Ralph, > > Do you think the 1.8.7 release will solve the problems w/our > heterogeneous cluster? > > Bill L. > > From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain > [r...@open-mpi.org] > Sent: Tuesday, July 07, 2015 8:59 PM > To: Open MPI Users > Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash > > No need for the lstopo data anymore, Bill - I was able to recreate the > situation using some very nice hwloc functions plus your prior descriptions. > I'm not totally confident that this fix will resolve the problem but it will > clear out at least one problem. > > We'll just have to see what happens and attack it next. > Ralph > > > On Tue, Jul 7, 2015 at 8:07 PM, Lane, William <william.l...@cshs.org > <mailto:william.l...@cshs.org>> wrote: > I'm sorry I haven't been able to get the lstopo information for > all the nodes, but I had to get the latest version of hwloc installed > first. They've even added in some more modern blades that also > support hyperthreading, ugh. They've also been doing some memory > upgrades as well. > > I'm trying to get a Bash script running on the cluster via qsub > that will run lstopo and output the host information to a file located > in my $HOME directory but it hasn't been working (there are 60 nodes > in the heterogeneous cluster that needs to have OpenMPI running). > > I will try to get the lstopo information by the end of the week. > > I'd be willing to do most anything to get these OpenMPI issues > resolved. I'd even wash your cars for you! > > -Bill L. > ________________________________________ > From: users [users-boun...@open-mpi.org <mailto:users-boun...@open-mpi.org>] > on behalf of Ralph Castain [r...@open-mpi.org <mailto:r...@open-mpi.org>] > Sent: Tuesday, July 07, 2015 1:36 PM > To: Open MPI Users > Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash > > I may have finally tracked this down. At least, I can now get the correct > devel map to come out, and found a memory corruption issue that only impacted > hetero operations. I can’t know if this is the root cause of the problem Bill > is seeing, however, as I have no way of actually running the job. > > I pushed this into the master and will bring it back to 1.8.7 as well as 1.10. > > Bill - would you be able/willing to give it a try there? It would be nice to > confirm this actually fixed the problem. > > > > On Jun 29, 2015, at 1:58 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com > > <mailto:jsquy...@cisco.com>> wrote: > > > > lstopo will tell you -- if there is more than one "PU" (hwloc terminology > > for "processing unit") per core, then hyper threading is enabled. If > > there's only one PU per core, then hyper threading is disabled. > > > > > >> On Jun 29, 2015, at 4:42 PM, Lane, William <william.l...@cshs.org > >> <mailto:william.l...@cshs.org>> wrote: > >> > >> Would the output of dmidecode -t processor and/or lstopo tell me > >> conclusively > >> if hyperthreading is enabled or not? Hyperthreading is supposed to be > >> enabled > >> for all the IBM x3550 M3 and M4 nodes, but I'm not sure if it actually is > >> and I > >> don't have access to the BIOS settings. > >> > >> -Bill L. 
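(Bill mentions above that a Bash script submitted through qsub to run lstopo on every node and write the result into $HOME "hasn't been working". Below is a minimal sketch of that kind of probe job, assuming an SGE-style scheduler since qsub is referenced; the job name, output path, and per-host submission are illustrative and not taken from the thread.)

    #!/bin/bash
    #$ -N lstopo-probe
    #$ -cwd
    # Sketch only: submit once per execution host, e.g.
    #   qsub -l hostname=csclprd3-0-13 lstopo-probe.sh
    # so each node writes its own topology report into $HOME.
    OUT="$HOME/lstopo-$(hostname -s).txt"
    {
      echo "### $(hostname -s) ###"
      lstopo                   # text topology tree, same form as the outputs quoted below
      dmidecode -t processor   # usually requires root; may print nothing under qsub
    } > "$OUT" 2>&1

Collecting these per-node files also answers Jeff's point above directly: if a node shows more than one PU per core in its lstopo output, hyperthreading is enabled there.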
> >> > >> From: users [users-boun...@open-mpi.org > >> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain > >> [r...@open-mpi.org <mailto:r...@open-mpi.org>] > >> Sent: Saturday, June 27, 2015 7:21 PM > >> To: Open MPI Users > >> Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash > >> > >> Bill - this is such a jumbled collection of machines that I’m having > >> trouble figuring out what I should replicate. I can create something > >> artificial here so I can try to debug this, but I need to know exactly > >> what I’m up against - can you tell me: > >> > >> * the architecture of each type - how many sockets, how many cores/socket, > >> HT on or off. If two nodes have the same physical setup but one has HT on > >> and the other off, then please consider those two different types > >> > >> * how many nodes of each type > >> > >> Looking at your map output, it looks like the map is being done correctly, > >> but somehow the binding locale isn’t getting set in some cases. You latest > >> error output would seem out-of-step with your prior reports, so something > >> else may be going on there. As I said earlier, this is the most hetero > >> environment we’ve seen, and so there may be some code paths your hitting > >> that haven’t been well exercised before. > >> > >> > >> > >> > >>> On Jun 26, 2015, at 5:22 PM, Lane, William <william.l...@cshs.org > >>> <mailto:william.l...@cshs.org>> wrote: > >>> > >>> Well, I managed to get a successful mpirun @ a slot count of 132 using > >>> --mca btl ^sm, > >>> however when I increased the slot count to 160, mpirun crashed without > >>> any error > >>> output: > >>> > >>> mpirun -np 160 -display-devel-map --prefix /hpc/apps/mpi/openmpi/1.8.6/ > >>> --hostfile hostfile-noslots --mca btl ^sm --hetero-nodes --bind-to core > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3 >> out.txt 2>&1 > >>> > >>> -------------------------------------------------------------------------- > >>> WARNING: a request was made to bind a process. While the system > >>> supports binding the process itself, at least one node does NOT > >>> support binding memory to the process location. > >>> > >>> Node: csclprd3-6-1 > >>> > >>> This usually is due to not having the required NUMA support installed > >>> on the node. In some Linux distributions, the required support is > >>> contained in the libnumactl and libnumactl-devel packages. > >>> This is a warning only; your job will continue, though performance may be > >>> degraded. > >>> -------------------------------------------------------------------------- > >>> -------------------------------------------------------------------------- > >>> A request was made to bind to that would result in binding more > >>> processes than cpus on a resource: > >>> > >>> Bind to: CORE > >>> Node: csclprd3-6-1 > >>> #processes: 2 > >>> #cpus: 1 > >>> > >>> You can override this protection by adding the "overload-allowed" > >>> option to your binding directive. > >>> -------------------------------------------------------------------------- > >>> > >>> But csclprd3-6-1 (a blade) does have 2 CPU's on 2 separate sockets w/2 > >>> cores apiece as shown in my dmidecode output: > >>> > >>> csclprd3-6-1 ~]# dmidecode -t processor > >>> # dmidecode 2.11 > >>> SMBIOS 2.4 present. 
> >>> > >>> Handle 0x0008, DMI type 4, 32 bytes > >>> Processor Information > >>> Socket Designation: Socket 1 CPU 1 > >>> Type: Central Processor > >>> Family: Xeon > >>> Manufacturer: GenuineIntel > >>> ID: F6 06 00 00 01 03 00 00 > >>> Signature: Type 0, Family 6, Model 15, Stepping 6 > >>> Flags: > >>> FPU (Floating-point unit on-chip) > >>> CX8 (CMPXCHG8 instruction supported) > >>> APIC (On-chip APIC hardware supported) > >>> Version: Intel Xeon > >>> Voltage: 2.9 V > >>> External Clock: 333 MHz > >>> Max Speed: 4000 MHz > >>> Current Speed: 3000 MHz > >>> Status: Populated, Enabled > >>> Upgrade: ZIF Socket > >>> L1 Cache Handle: 0x0004 > >>> L2 Cache Handle: 0x0005 > >>> L3 Cache Handle: Not Provided > >>> > >>> Handle 0x0009, DMI type 4, 32 bytes > >>> Processor Information > >>> Socket Designation: Socket 2 CPU 2 > >>> Type: Central Processor > >>> Family: Xeon > >>> Manufacturer: GenuineIntel > >>> ID: F6 06 00 00 01 03 00 00 > >>> Signature: Type 0, Family 6, Model 15, Stepping 6 > >>> Flags: > >>> FPU (Floating-point unit on-chip) > >>> CX8 (CMPXCHG8 instruction supported) > >>> APIC (On-chip APIC hardware supported) > >>> Version: Intel Xeon > >>> Voltage: 2.9 V > >>> External Clock: 333 MHz > >>> Max Speed: 4000 MHz > >>> Current Speed: 3000 MHz > >>> Status: Populated, Enabled > >>> Upgrade: ZIF Socket > >>> L1 Cache Handle: 0x0006 > >>> L2 Cache Handle: 0x0007 > >>> L3 Cache Handle: Not Provided > >>> > >>> csclprd3-6-1 ~]# lstopo > >>> Machine (16GB) > >>> Socket L#0 + L2 L#0 (4096KB) > >>> L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0) > >>> L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#2) > >>> Socket L#1 + L2 L#1 (4096KB) > >>> L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#1) > >>> L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3) > >>> > >>> csclprd3-0-1 information (which looks correct as this particular x3550 > >>> should > >>> have one socket populated (of two) with a 6 core Xeon (or 12 cores > >>> w/hyperthreading > >>> turned on): > >>> > >>> csclprd3-0-1 ~]# lstopo > >>> Machine (71GB) > >>> Socket L#0 + L3 L#0 (12MB) > >>> L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU > >>> L#0 (P#0) > >>> L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU > >>> L#1 (P#1) > >>> L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU > >>> L#2 (P#2) > >>> L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU > >>> L#3 (P#3) > >>> L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU > >>> L#4 (P#4) > >>> L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU > >>> L#5 (P#5) > >>> > >>> csclprd3-0-1 ~]# dmidecode -t processor > >>> # dmidecode 2.11 > >>> # SMBIOS entry point at 0x7f6be000 > >>> SMBIOS 2.5 present. 
> >>> > >>> Handle 0x0001, DMI type 4, 40 bytes > >>> Processor Information > >>> Socket Designation: Node 1 Socket 1 > >>> Type: Central Processor > >>> Family: Xeon MP > >>> Manufacturer: Intel(R) Corporation > >>> ID: C2 06 02 00 FF FB EB BF > >>> Signature: Type 0, Family 6, Model 44, Stepping 2 > >>> Flags: > >>> FPU (Floating-point unit on-chip) > >>> VME (Virtual mode extension) > >>> DE (Debugging extension) > >>> PSE (Page size extension) > >>> TSC (Time stamp counter) > >>> MSR (Model specific registers) > >>> PAE (Physical address extension) > >>> MCE (Machine check exception) > >>> CX8 (CMPXCHG8 instruction supported) > >>> APIC (On-chip APIC hardware supported) > >>> SEP (Fast system call) > >>> MTRR (Memory type range registers) > >>> PGE (Page global enable) > >>> MCA (Machine check architecture) > >>> CMOV (Conditional move instruction supported) > >>> PAT (Page attribute table) > >>> PSE-36 (36-bit page size extension) > >>> CLFSH (CLFLUSH instruction supported) > >>> DS (Debug store) > >>> ACPI (ACPI supported) > >>> MMX (MMX technology supported) > >>> FXSR (FXSAVE and FXSTOR instructions supported) > >>> SSE (Streaming SIMD extensions) > >>> SSE2 (Streaming SIMD extensions 2) > >>> SS (Self-snoop) > >>> HTT (Multi-threading) > >>> TM (Thermal monitor supported) > >>> PBE (Pending break enabled) > >>> Version: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz > >>> Voltage: 1.2 V > >>> External Clock: 5866 MHz > >>> Max Speed: 4400 MHz > >>> Current Speed: 2400 MHz > >>> Status: Populated, Enabled > >>> Upgrade: ZIF Socket > >>> L1 Cache Handle: 0x0002 > >>> L2 Cache Handle: 0x0003 > >>> L3 Cache Handle: 0x0004 > >>> Serial Number: Not Specified > >>> Asset Tag: Not Specified > >>> Part Number: Not Specified > >>> Core Count: 6 > >>> Core Enabled: 6 > >>> Thread Count: 6 > >>> Characteristics: > >>> 64-bit capable > >>> > >>> Handle 0x005A, DMI type 4, 40 bytes > >>> Processor Information > >>> Socket Designation: Node 1 Socket 2 > >>> Type: Central Processor > >>> Family: Xeon MP > >>> Manufacturer: Not Specified > >>> ID: 00 00 00 00 00 00 00 00 > >>> Signature: Type 0, Family 0, Model 0, Stepping 0 > >>> Flags: None > >>> Version: Not Specified > >>> Voltage: 1.2 V > >>> External Clock: 5866 MHz > >>> Max Speed: 4400 MHz > >>> Current Speed: Unknown > >>> Status: Unpopulated > >>> Upgrade: ZIF Socket > >>> L1 Cache Handle: Not Provided > >>> L2 Cache Handle: Not Provided > >>> L3 Cache Handle: Not Provided > >>> Serial Number: Not Specified > >>> Asset Tag: Not Specified > >>> Part Number: Not Specified > >>> Characteristics: None > >>> > >>> > >>> From: users [users-boun...@open-mpi.org > >>> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain > >>> [r...@open-mpi.org <mailto:r...@open-mpi.org>] > >>> Sent: Wednesday, June 24, 2015 6:06 AM > >>> To: Open MPI Users > >>> Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = > >>> crash > >>> > >>> I think trying with --mca btl ^sm makes a lot of sense and may solve the > >>> problem. I also noted that we are having trouble with the topology of > >>> several of the nodes - seeing only one socket, non-HT where you say we > >>> should see two sockets and HT-enabled. In those cases, the locality is > >>> "unknown" - given that those procs are on remote nodes from the one being > >>> impacted, I don't think it should cause a problem. However, it isn't > >>> correct, and that raises flags. 
> >>> > >>> My best guess of the root cause of that error is either we are getting > >>> bad topology info on those nodes, or we have a bug that is mishandling > >>> this scenario. It would probably be good to get this error fixed to > >>> ensure it isn't the source of the eventual crash, even though I'm not > >>> sure they are related. > >>> > >>> Bill: Can we examine one of the problem nodes? Let's pick csclprd3-0-1 > >>> (or take another one from your list - just look for one where "locality" > >>> is reported as "unknown" for the procs in the output map). Can you run > >>> lstopo on that node and send us the output? In the above map, it is > >>> reporting a single socket with 6 cores, non-HT. Is that what lstopo shows > >>> when run on the node? Is it what you expected? > >>> > >>> > >>> On Wed, Jun 24, 2015 at 4:07 AM, Gilles Gouaillardet > >>> <gilles.gouaillar...@gmail.com <mailto:gilles.gouaillar...@gmail.com>> > >>> wrote: > >>> Bill, > >>> > >>> were you able to get a core file and analyze the stack with gdb ? > >>> > >>> I suspect the error occurs in mca_btl_sm_add_procs but this is just my > >>> best guess. > >>> if this is correct, can you check the value of > >>> mca_btl_sm_component.num_smp_procs ? > >>> > >>> as a workaround, can you try > >>> mpirun --mca btl ^sm ... > >>> > >>> I do not see how I can tackle the root cause without being able to > >>> reproduce the issue :-( > >>> > >>> can you try to reproduce the issue with the smallest hostfile, and then > >>> run lstopo on all the nodes ? > >>> btw, you are not mixing 32 bits and 64 bits OS, are you ? > >>> > >>> Cheers, > >>> > >>> Gilles > >>> > >>> > >>> > >>> > >>> mca_btl_sm_add_procs( > >>> > >>> > >>> > >>> int > >>> > >>> mca_btl_sm_add_procs > >>> ( > >>> On Wednesday, June 24, 2015, Lane, William <william.l...@cshs.org > >>> <mailto:william.l...@cshs.org>> wrote: > >>> Gilles, > >>> > >>> All the blades only have two core Xeons (without hyperthreading) > >>> populating both their sockets. All > >>> the x3550 nodes have hyperthreading capable Xeons and Sandybridge server > >>> CPU's. It's possible > >>> hyperthreading has been disabled on some of these nodes though. The 3-0-n > >>> nodes are all IBM x3550 > >>> nodes while the 3-6-n nodes are all blade nodes. > >>> > >>> I have run this exact same test code successfully in the past on another > >>> cluster (~200 nodes of Sunfire X2100 > >>> 2x dual-core Opterons) w/no issues on upwards of 390 slots. I even tested > >>> it recently on OpenMPI 1.8.5 > >>> on another smaller R&D cluster consisting of 10 Sunfire X2100 nodes (w/2 > >>> dual core Opterons apiece). > >>> On this particular cluster I've had success running this code on < 132 > >>> slots. > >>> > >>> Anyway, here's the results of the following mpirun: > >>> > >>> mpirun -np 132 -display-devel-map --prefix /hpc/apps/mpi/openmpi/1.8.6/ > >>> --hostfile hostfile-noslots --mca btl_tcp_if_include eth0 --hetero-nodes > >>> --bind-to core /hpc/home/lanew/mpi/openmpi/ProcessColors3 >> out.txt 2>&1 > >>> > >>> -------------------------------------------------------------------------- > >>> WARNING: a request was made to bind a process. While the system > >>> supports binding the process itself, at least one node does NOT > >>> support binding memory to the process location. > >>> > >>> Node: csclprd3-6-1 > >>> > >>> This usually is due to not having the required NUMA support installed > >>> on the node. 
In some Linux distributions, the required support is > >>> contained in the libnumactl and libnumactl-devel packages. > >>> This is a warning only; your job will continue, though performance may be > >>> degraded. > >>> -------------------------------------------------------------------------- > >>> Data for JOB [51718,1] offset 0 > >>> > >>> Mapper requested: NULL Last mapper: round_robin Mapping policy: > >>> BYSOCKET Ranking policy: SLOT > >>> Binding policy: CORE Cpu set: NULL PPR: NULL Cpus-per-rank: 1 > >>> Num new daemons: 0 New daemon starting vpid INVALID > >>> Num nodes: 15 > >>> > >>> Data for node: csclprd3-6-1 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],1] Daemon launched: True > >>> Num slots: 4 Slots in use: 4 Oversubscribed: FALSE > >>> Num slots allocated: 4 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 4 Next node_rank: 4 > >>> Data for proc: [[51718,1],0] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 0 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B][./.] > >>> Binding: [B/.][./.] > >>> Data for proc: [[51718,1],1] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 1 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./.][B/B] > >>> Binding: [./.][B/.] > >>> Data for proc: [[51718,1],2] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 2 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B][./.] > >>> Binding: [./B][./.] > >>> Data for proc: [[51718,1],3] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 3 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./.][B/B] > >>> Binding: [./.][./B] > >>> > >>> Data for node: csclprd3-6-5 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],2] Daemon launched: True > >>> Num slots: 4 Slots in use: 4 Oversubscribed: FALSE > >>> Num slots allocated: 4 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 4 Next node_rank: 4 > >>> Data for proc: [[51718,1],4] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 4 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B][./.] > >>> Binding: [B/.][./.] > >>> Data for proc: [[51718,1],5] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 5 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./.][B/B] > >>> Binding: [./.][B/.] > >>> Data for proc: [[51718,1],6] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 6 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B][./.] > >>> Binding: [./B][./.] > >>> Data for proc: [[51718,1],7] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 7 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./.][B/B] > >>> Binding: [./.][./B] > >>> > >>> Data for node: csclprd3-0-0 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],3] Daemon launched: True > >>> Num slots: 12 Slots in use: 12 Oversubscribed: FALSE > >>> Num slots allocated: 12 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 12 Next node_rank: 12 > >>> Data for proc: [[51718,1],8] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 8 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B/B/B/B/B][./././././.] > >>> Binding: [B/././././.][./././././.] > >>> Data for proc: [[51718,1],9] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 9 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./././././.][B/B/B/B/B/B] > >>> Binding: [./././././.][B/././././.] > >>> Data for proc: [[51718,1],10] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 10 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B/B/B/B/B][./././././.] > >>> Binding: [./B/./././.][./././././.] 
> >>> Data for proc: [[51718,1],11] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 11 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./././././.][B/B/B/B/B/B] > >>> Binding: [./././././.][./B/./././.] > >>> Data for proc: [[51718,1],12] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 12 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B/B/B/B/B][./././././.] > >>> Binding: [././B/././.][./././././.] > >>> Data for proc: [[51718,1],13] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 13 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./././././.][B/B/B/B/B/B] > >>> Binding: [./././././.][././B/././.] > >>> Data for proc: [[51718,1],14] > >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 14 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B/B/B/B/B][./././././.] > >>> Binding: [./././B/./.][./././././.] > >>> Data for proc: [[51718,1],15] > >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 15 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./././././.][B/B/B/B/B/B] > >>> Binding: [./././././.][./././B/./.] > >>> Data for proc: [[51718,1],16] > >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 16 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B/B/B/B/B][./././././.] > >>> Binding: [././././B/.][./././././.] > >>> Data for proc: [[51718,1],17] > >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 17 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./././././.][B/B/B/B/B/B] > >>> Binding: [./././././.][././././B/.] > >>> Data for proc: [[51718,1],18] > >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 18 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [B/B/B/B/B/B][./././././.] > >>> Binding: [./././././B][./././././.] > >>> Data for proc: [[51718,1],19] > >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 19 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [./././././.][B/B/B/B/B/B] > >>> Binding: [./././././.][./././././B] > >>> > >>> Data for node: csclprd3-0-1 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],4] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],20] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 20 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [B/././././.] > >>> Data for proc: [[51718,1],21] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 21 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./B/./././.] > >>> Data for proc: [[51718,1],22] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 22 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././B/././.] > >>> Data for proc: [[51718,1],23] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 23 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././B/./.] > >>> Data for proc: [[51718,1],24] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 24 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././././B/.] 
> >>> Data for proc: [[51718,1],25] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 25 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././././B] > >>> > >>> Data for node: csclprd3-0-2 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],5] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],26] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 26 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [B/././././.] > >>> Data for proc: [[51718,1],27] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 27 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./B/./././.] > >>> Data for proc: [[51718,1],28] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 28 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././B/././.] > >>> Data for proc: [[51718,1],29] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 29 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././B/./.] > >>> Data for proc: [[51718,1],30] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 30 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././././B/.] > >>> Data for proc: [[51718,1],31] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 31 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././././B] > >>> > >>> Data for node: csclprd3-0-3 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],6] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],32] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 32 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [B/././././.] > >>> Data for proc: [[51718,1],33] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 33 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./B/./././.] > >>> Data for proc: [[51718,1],34] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 34 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././B/././.] > >>> Data for proc: [[51718,1],35] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 35 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././B/./.] > >>> Data for proc: [[51718,1],36] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 36 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././././B/.] > >>> Data for proc: [[51718,1],37] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 37 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././././B] > >>> > >>> Data for node: csclprd3-0-4 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],7] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],38] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 38 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [B/././././.] > >>> Data for proc: [[51718,1],39] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 39 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./B/./././.] 
> >>> Data for proc: [[51718,1],40] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 40 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././B/././.] > >>> Data for proc: [[51718,1],41] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 41 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././B/./.] > >>> Data for proc: [[51718,1],42] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 42 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././././B/.] > >>> Data for proc: [[51718,1],43] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 43 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././././B] > >>> > >>> Data for node: csclprd3-0-5 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],8] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],44] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 44 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [B/././././.] > >>> Data for proc: [[51718,1],45] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 45 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./B/./././.] > >>> Data for proc: [[51718,1],46] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 46 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././B/././.] > >>> Data for proc: [[51718,1],47] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 47 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././B/./.] > >>> Data for proc: [[51718,1],48] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 48 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././././B/.] > >>> Data for proc: [[51718,1],49] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 49 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././././B] > >>> > >>> Data for node: csclprd3-0-6 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],9] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],50] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 50 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [B/././././.] > >>> Data for proc: [[51718,1],51] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 51 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./B/./././.] > >>> Data for proc: [[51718,1],52] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 52 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././B/././.] > >>> Data for proc: [[51718,1],53] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 53 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././B/./.] > >>> Data for proc: [[51718,1],54] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 54 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [././././B/.] 
> >>> Data for proc: [[51718,1],55] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 55 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [./././././B] > >>> > >>> Data for node: csclprd3-0-7 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],10] Daemon launched: True > >>> Num slots: 16 Slots in use: 16 Oversubscribed: FALSE > >>> Num slots allocated: 16 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 16 Next node_rank: 16 > >>> Data for proc: [[51718,1],56] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 56 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [BB/../../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],57] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 57 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][BB/../../../../../../..] > >>> Data for proc: [[51718,1],58] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 58 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../BB/../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],59] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 59 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../BB/../../../../../..] > >>> Data for proc: [[51718,1],60] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 60 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../BB/../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],61] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 61 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../BB/../../../../..] > >>> Data for proc: [[51718,1],62] > >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 62 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../BB/../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],63] > >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 63 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../BB/../../../..] > >>> Data for proc: [[51718,1],64] > >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 64 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../BB/../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],65] > >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 65 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../BB/../../..] > >>> Data for proc: [[51718,1],66] > >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 66 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../BB/../..][../../../../../../../..] 
> >>> Data for proc: [[51718,1],67] > >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 67 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../BB/../..] > >>> Data for proc: [[51718,1],68] > >>> Pid: 0 Local rank: 12 Node rank: 12 App rank: 68 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../BB/..][../../../../../../../..] > >>> Data for proc: [[51718,1],69] > >>> Pid: 0 Local rank: 13 Node rank: 13 App rank: 69 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../BB/..] > >>> Data for proc: [[51718,1],70] > >>> Pid: 0 Local rank: 14 Node rank: 14 App rank: 70 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../../BB][../../../../../../../..] > >>> Data for proc: [[51718,1],71] > >>> Pid: 0 Local rank: 15 Node rank: 15 App rank: 71 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../../BB] > >>> > >>> Data for node: csclprd3-0-8 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],11] Daemon launched: True > >>> Num slots: 16 Slots in use: 16 Oversubscribed: FALSE > >>> Num slots allocated: 16 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 16 Next node_rank: 16 > >>> Data for proc: [[51718,1],72] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 72 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [BB/../../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],73] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 73 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][BB/../../../../../../..] > >>> Data for proc: [[51718,1],74] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 74 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../BB/../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],75] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 75 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../BB/../../../../../..] > >>> Data for proc: [[51718,1],76] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 76 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../BB/../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],77] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 77 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../BB/../../../../..] > >>> Data for proc: [[51718,1],78] > >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 78 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../BB/../../../..][../../../../../../../..] 
> >>> Data for proc: [[51718,1],79] > >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 79 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../BB/../../../..] > >>> Data for proc: [[51718,1],80] > >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 80 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../BB/../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],81] > >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 81 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../BB/../../..] > >>> Data for proc: [[51718,1],82] > >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 82 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../BB/../..][../../../../../../../..] > >>> Data for proc: [[51718,1],83] > >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 83 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../BB/../..] > >>> Data for proc: [[51718,1],84] > >>> Pid: 0 Local rank: 12 Node rank: 12 App rank: 84 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../BB/..][../../../../../../../..] > >>> Data for proc: [[51718,1],85] > >>> Pid: 0 Local rank: 13 Node rank: 13 App rank: 85 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../BB/..] > >>> Data for proc: [[51718,1],86] > >>> Pid: 0 Local rank: 14 Node rank: 14 App rank: 86 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../../BB][../../../../../../../..] > >>> Data for proc: [[51718,1],87] > >>> Pid: 0 Local rank: 15 Node rank: 15 App rank: 87 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../../BB] > >>> > >>> Data for node: csclprd3-0-10 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],12] Daemon launched: True > >>> Num slots: 16 Slots in use: 16 Oversubscribed: FALSE > >>> Num slots allocated: 16 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 16 Next node_rank: 16 > >>> Data for proc: [[51718,1],88] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 88 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [BB/../../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],89] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 89 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][BB/../../../../../../..] > >>> Data for proc: [[51718,1],90] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 90 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../BB/../../../../../..][../../../../../../../..] 
> >>> Data for proc: [[51718,1],91] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 91 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../BB/../../../../../..] > >>> Data for proc: [[51718,1],92] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 92 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../BB/../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],93] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 93 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../BB/../../../../..] > >>> Data for proc: [[51718,1],94] > >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 94 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../BB/../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],95] > >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 95 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../BB/../../../..] > >>> Data for proc: [[51718,1],96] > >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 96 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../BB/../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],97] > >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 97 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../BB/../../..] > >>> Data for proc: [[51718,1],98] > >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 98 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../BB/../..][../../../../../../../..] > >>> Data for proc: [[51718,1],99] > >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 99 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../BB/../..] > >>> Data for proc: [[51718,1],100] > >>> Pid: 0 Local rank: 12 Node rank: 12 App rank: 100 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../BB/..][../../../../../../../..] > >>> Data for proc: [[51718,1],101] > >>> Pid: 0 Local rank: 13 Node rank: 13 App rank: 101 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../BB/..] > >>> Data for proc: [[51718,1],102] > >>> Pid: 0 Local rank: 14 Node rank: 14 App rank: 102 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../../BB][../../../../../../../..] 
> >>> Data for proc: [[51718,1],103] > >>> Pid: 0 Local rank: 15 Node rank: 15 App rank: 103 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../../BB] > >>> > >>> Data for node: csclprd3-0-11 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],13] Daemon launched: True > >>> Num slots: 16 Slots in use: 16 Oversubscribed: FALSE > >>> Num slots allocated: 16 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 16 Next node_rank: 16 > >>> Data for proc: [[51718,1],104] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 104 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [BB/../../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],105] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 105 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][BB/../../../../../../..] > >>> Data for proc: [[51718,1],106] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 106 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../BB/../../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],107] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 107 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../BB/../../../../../..] > >>> Data for proc: [[51718,1],108] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 108 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../BB/../../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],109] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 109 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../BB/../../../../..] > >>> Data for proc: [[51718,1],110] > >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 110 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../BB/../../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],111] > >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 111 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../BB/../../../..] > >>> Data for proc: [[51718,1],112] > >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 112 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../BB/../../..][../../../../../../../..] > >>> Data for proc: [[51718,1],113] > >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 113 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../BB/../../..] > >>> Data for proc: [[51718,1],114] > >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 114 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../BB/../..][../../../../../../../..] 
> >>> Data for proc: [[51718,1],115] > >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 115 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../BB/../..] > >>> Data for proc: [[51718,1],116] > >>> Pid: 0 Local rank: 12 Node rank: 12 App rank: 116 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../BB/..][../../../../../../../..] > >>> Data for proc: [[51718,1],117] > >>> Pid: 0 Local rank: 13 Node rank: 13 App rank: 117 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../BB/..] > >>> Data for proc: [[51718,1],118] > >>> Pid: 0 Local rank: 14 Node rank: 14 App rank: 118 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> Binding: [../../../../../../../BB][../../../../../../../..] > >>> Data for proc: [[51718,1],119] > >>> Pid: 0 Local rank: 15 Node rank: 15 App rank: 119 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../../../..][../../../../../../../BB] > >>> > >>> Data for node: csclprd3-0-12 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],14] Daemon launched: True > >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 6 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],120] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 120 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [BB/../../../../..] > >>> Data for proc: [[51718,1],121] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 121 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [../BB/../../../..] > >>> Data for proc: [[51718,1],122] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 122 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [../../BB/../../..] > >>> Data for proc: [[51718,1],123] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 123 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [../../../BB/../..] > >>> Data for proc: [[51718,1],124] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 124 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [../../../../BB/..] > >>> Data for proc: [[51718,1],125] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 125 > >>> State: INITIALIZED App_context: 0 > >>> Locale: UNKNOWN > >>> Binding: [../../../../../BB] > >>> > >>> Data for node: csclprd3-0-13 Launch id: -1 State: 0 > >>> Daemon: [[51718,0],15] Daemon launched: True > >>> Num slots: 12 Slots in use: 6 Oversubscribed: FALSE > >>> Num slots allocated: 12 Max slots: 0 > >>> Username on node: NULL > >>> Num procs: 6 Next node_rank: 6 > >>> Data for proc: [[51718,1],126] > >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 126 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB][../../../../../..] > >>> Binding: [BB/../../../../..][../../../../../..] > >>> Data for proc: [[51718,1],127] > >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 127 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../..][BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../..][BB/../../../../..] 
> >>> Data for proc: [[51718,1],128] > >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 128 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB][../../../../../..] > >>> Binding: [../BB/../../../..][../../../../../..] > >>> Data for proc: [[51718,1],129] > >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 129 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../..][BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../..][../BB/../../../..] > >>> Data for proc: [[51718,1],130] > >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 130 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [BB/BB/BB/BB/BB/BB][../../../../../..] > >>> Binding: [../../BB/../../..][../../../../../..] > >>> Data for proc: [[51718,1],131] > >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 131 > >>> State: INITIALIZED App_context: 0 > >>> Locale: [../../../../../..][BB/BB/BB/BB/BB/BB] > >>> Binding: [../../../../../..][../../BB/../../..] > >>> [csclprd3-0-13:31619] *** Process received signal *** > >>> [csclprd3-0-13:31619] Signal: Bus error (7) > >>> [csclprd3-0-13:31619] Signal code: Non-existant physical address (2) > >>> [csclprd3-0-13:31619] Failing at address: 0x7f1374267a00 > >>> [csclprd3-0-13:31620] *** Process received signal *** > >>> [csclprd3-0-13:31620] Signal: Bus error (7) > >>> [csclprd3-0-13:31620] Signal code: Non-existant physical address (2) > >>> [csclprd3-0-13:31620] Failing at address: 0x7fcc702a7980 > >>> [csclprd3-0-13:31615] *** Process received signal *** > >>> [csclprd3-0-13:31615] Signal: Bus error (7) > >>> [csclprd3-0-13:31615] Signal code: Non-existant physical address (2) > >>> [csclprd3-0-13:31615] Failing at address: 0x7f8128367880 > >>> [csclprd3-0-13:31616] *** Process received signal *** > >>> [csclprd3-0-13:31616] Signal: Bus error (7) > >>> [csclprd3-0-13:31616] Signal code: Non-existant physical address (2) > >>> [csclprd3-0-13:31616] Failing at address: 0x7fe674227a00 > >>> [csclprd3-0-13:31617] *** Process received signal *** > >>> [csclprd3-0-13:31617] Signal: Bus error (7) > >>> [csclprd3-0-13:31617] Signal code: Non-existant physical address (2) > >>> [csclprd3-0-13:31617] Failing at address: 0x7f061c32db80 > >>> [csclprd3-0-13:31618] *** Process received signal *** > >>> [csclprd3-0-13:31618] Signal: Bus error (7) > >>> [csclprd3-0-13:31618] Signal code: Non-existant physical address (2) > >>> [csclprd3-0-13:31618] Failing at address: 0x7fb8402eaa80 > >>> [csclprd3-0-13:31618] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7fb851851500] > >>> [csclprd3-0-13:31618] [ 1] [csclprd3-0-13:31616] [ 0] > >>> /lib64/libpthread.so.0(+0xf500)[0x7fe6843a4500] > >>> [csclprd3-0-13:31616] [ 1] [csclprd3-0-13:31620] [ 0] > >>> /lib64/libpthread.so.0(+0xf500)[0x7fcc80c54500] > >>> [csclprd3-0-13:31620] [ 1] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fcc80fc9f61] > >>> [csclprd3-0-13:31620] [ 2] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fcc80fca047] > >>> [csclprd3-0-13:31620] [ 3] [csclprd3-0-13:31615] [ 0] > >>> /lib64/libpthread.so.0(+0xf500)[0x7f81385ca500] > >>> [csclprd3-0-13:31615] [ 1] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f813893ff61] > >>> [csclprd3-0-13:31615] [ 2] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f8138940047] > >>> [csclprd3-0-13:31615] [ 3] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fb851bc6f61] > >>> [csclprd3-0-13:31618] [ 2] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fb851bc7047] > >>> [csclprd3-0-13:31618] [ 
3] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fb851ab4670] > >>> [csclprd3-0-13:31618] [ 4] [csclprd3-0-13:31617] [ 0] > >>> /lib64/libpthread.so.0(+0xf500)[0x7f062cfe5500] > >>> [csclprd3-0-13:31617] [ 1] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f062d35af61] > >>> [csclprd3-0-13:31617] [ 2] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f062d35b047] > >>> [csclprd3-0-13:31617] [ 3] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f062d248670] > >>> [csclprd3-0-13:31617] [ 4] [csclprd3-0-13:31619] [ 0] > >>> /lib64/libpthread.so.0(+0xf500)[0x7f1384fde500] > >>> [csclprd3-0-13:31619] [ 1] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f1385353f61] > >>> [csclprd3-0-13:31619] [ 2] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fe684719f61] > >>> [csclprd3-0-13:31616] [ 2] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fe68471a047] > >>> [csclprd3-0-13:31616] [ 3] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fe684607670] > >>> [csclprd3-0-13:31616] [ 4] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f1385354047] > >>> [csclprd3-0-13:31619] [ 3] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f1385241670] > >>> [csclprd3-0-13:31619] [ 4] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f13852425ab] > >>> [csclprd3-0-13:31619] [ 5] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f1385242751] > >>> [csclprd3-0-13:31619] [ 6] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f13853501c9] > >>> [csclprd3-0-13:31619] [ 7] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f1385336628] > >>> [csclprd3-0-13:31619] [ 8] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fcc80eb7670] > >>> [csclprd3-0-13:31620] [ 4] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fcc80eb85ab] > >>> [csclprd3-0-13:31620] [ 5] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fcc80eb8751] > >>> [csclprd3-0-13:31620] [ 6] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fcc80fc61c9] > >>> [csclprd3-0-13:31620] [ 7] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fcc80fac628] > >>> [csclprd3-0-13:31620] [ 8] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fcc8111fd61] > >>> [csclprd3-0-13:31620] [ 9] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f813882d670] > >>> [csclprd3-0-13:31615] [ 4] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f813882e5ab] > >>> [csclprd3-0-13:31615] [ 5] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f813882e751] > >>> [csclprd3-0-13:31615] [ 6] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f813893c1c9] > >>> [csclprd3-0-13:31615] [ 7] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f8138922628] > >>> [csclprd3-0-13:31615] [ 8] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f8138a95d61] > >>> [csclprd3-0-13:31615] [ 9] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f813885d747] > >>> [csclprd3-0-13:31615] [10] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fb851ab55ab] > >>> [csclprd3-0-13:31618] [ 5] > >>> 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fb851ab5751] > >>> [csclprd3-0-13:31618] [ 6] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fb851bc31c9] > >>> [csclprd3-0-13:31618] [ 7] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fb851ba9628] > >>> [csclprd3-0-13:31618] [ 8] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fb851d1cd61] > >>> [csclprd3-0-13:31618] [ 9] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fb851ae4747] > >>> [csclprd3-0-13:31618] [10] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f062d2495ab] > >>> [csclprd3-0-13:31617] [ 5] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f062d249751] > >>> [csclprd3-0-13:31617] [ 6] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f062d3571c9] > >>> [csclprd3-0-13:31617] [ 7] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f062d33d628] > >>> [csclprd3-0-13:31617] [ 8] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f062d4b0d61] > >>> [csclprd3-0-13:31617] [ 9] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f062d278747] > >>> [csclprd3-0-13:31617] [10] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fe6846085ab] > >>> [csclprd3-0-13:31616] [ 5] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fe684608751] > >>> [csclprd3-0-13:31616] [ 6] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fe6847161c9] > >>> [csclprd3-0-13:31616] [ 7] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fe6846fc628] > >>> [csclprd3-0-13:31616] [ 8] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fe68486fd61] > >>> [csclprd3-0-13:31616] [ 9] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fe684637747] > >>> [csclprd3-0-13:31616] [10] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fe68467750b] > >>> [csclprd3-0-13:31616] [11] > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] > >>> [csclprd3-0-13:31616] [12] > >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fe684021cdd] > >>> [csclprd3-0-13:31616] [13] > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] > >>> [csclprd3-0-13:31616] *** End of error message *** > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f062d2b850b] > >>> [csclprd3-0-13:31617] [11] > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] > >>> [csclprd3-0-13:31617] [12] > >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f062cc62cdd] > >>> [csclprd3-0-13:31617] [13] > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] > >>> [csclprd3-0-13:31617] *** End of error message *** > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f13854a9d61] > >>> [csclprd3-0-13:31619] [ 9] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f1385271747] > >>> [csclprd3-0-13:31619] [10] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f13852b150b] > >>> [csclprd3-0-13:31619] [11] > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] > >>> [csclprd3-0-13:31619] [12] > >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f1384c5bcdd] > >>> [csclprd3-0-13:31619] [13] > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] > >>> 
[csclprd3-0-13:31619] *** End of error message *** > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fcc80ee7747] > >>> [csclprd3-0-13:31620] [10] > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fcc80f2750b] > >>> [csclprd3-0-13:31620] [11] > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] > >>> [csclprd3-0-13:31620] [12] > >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fcc808d1cdd] > >>> [csclprd3-0-13:31620] [13] > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] > >>> [csclprd3-0-13:31620] *** End of error message *** > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f813889d50b] > >>> [csclprd3-0-13:31615] [11] > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] > >>> [csclprd3-0-13:31615] [12] > >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8138247cdd] > >>> [csclprd3-0-13:31615] [13] > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] > >>> [csclprd3-0-13:31615] *** End of error message *** > >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fb851b2450b] > >>> [csclprd3-0-13:31618] [11] > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] > >>> [csclprd3-0-13:31618] [12] > >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fb8514cecdd] > >>> [csclprd3-0-13:31618] [13] > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] > >>> [csclprd3-0-13:31618] *** End of error message *** > >>> -------------------------------------------------------------------------- > >>> mpirun noticed that process rank 126 with PID 0 on node csclprd3-0-13 > >>> exited on signal 7 (Bus error). > >>> -------------------------------------------------------------------------- > >>> > >>> From: users [users-boun...@open-mpi.org > >>> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain > >>> [r...@open-mpi.org <mailto:r...@open-mpi.org>] > >>> Sent: Tuesday, June 23, 2015 6:20 PM > >>> To: Open MPI Users > >>> Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = > >>> crash > >>> > >>> Wow - that is one sick puppy! I see that some nodes are reporting > >>> not-bound for their procs, and the rest are binding to socket (as they > >>> should). Some of your nodes clearly do not have hyper threads enabled (or > >>> only have single-thread cores on them), and have 2 cores/socket. Other > >>> nodes have 8 cores/socket with hyper threads enabled, while still others > >>> have 6 cores/socket and HT enabled. > >>> > >>> I don't see anyone binding to a single HT if multiple HTs/core are > >>> available. I think you are being fooled by those nodes that either don't > >>> have HT enabled, or have only 1 HT/core. > >>> > >>> In both cases, it is node 13 that is the node that fails. I also note > >>> that you said everything works okay with < 132 ranks, and node 13 hosts > >>> ranks 127-131. So node 13 would host ranks even if you reduced the number > >>> in the job to 131. This would imply that it probably isn't something > >>> wrong with the node itself. > >>> > >>> Is there any way you could run a job of this size on a homogeneous > >>> cluster? The procs all show bindings that look right, but I'm wondering > >>> if the heterogeneity is the source of the trouble here. We may be > >>> communicating the binding pattern incorrectly and giving bad info to the > >>> backend daemon. > >>> > >>> Also, rather than --report-bindings, use "--display-devel-map" on the > >>> command line and let's see what the mapper thinks it did. 
If there is a > >>> problem with placement, that is where it would exist. > >>> > >>> > >>> On Tue, Jun 23, 2015 at 5:12 PM, Lane, William <william.l...@cshs.org > >>> <mailto:william.l...@cshs.org>> wrote: > >>> Ralph, > >>> > >>> There is something funny going on, the trace from the > >>> runs w/the debug build aren't showing any differences from > >>> what I got earlier. However, I did do a run w/the --bind-to core > >>> switch and was surprised to see that hyperthreading cores were > >>> sometimes being used. > >>> > >>> Here's the traces that I have: > >>> > >>> mpirun -np 132 -report-bindings --prefix /hpc/apps/mpi/openmpi/1.8.6/ > >>> --hostfile hostfile-noslots --mca btl_tcp_if_include eth0 --hetero-nodes > >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3 > >>> [csclprd3-0-5:16802] MCW rank 44 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-5:16802] MCW rank 45 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-5:16802] MCW rank 46 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-6-5:12480] MCW rank 4 bound to socket 0[core 0[hwt 0]], socket > >>> 0[core 1[hwt 0]]: [B/B][./.] > >>> [csclprd3-6-5:12480] MCW rank 5 bound to socket 1[core 2[hwt 0]], socket > >>> 1[core 3[hwt 0]]: [./.][B/B] > >>> [csclprd3-6-5:12480] MCW rank 6 bound to socket 0[core 0[hwt 0]], socket > >>> 0[core 1[hwt 0]]: [B/B][./.] > >>> [csclprd3-6-5:12480] MCW rank 7 bound to socket 1[core 2[hwt 0]], socket > >>> 1[core 3[hwt 0]]: [./.][B/B] > >>> [csclprd3-0-5:16802] MCW rank 47 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-5:16802] MCW rank 48 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-5:16802] MCW rank 49 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-1:14318] MCW rank 22 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-1:14318] MCW rank 23 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-1:14318] MCW rank 24 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-6-1:24682] MCW rank 3 bound to socket 1[core 2[hwt 0]], socket > >>> 1[core 3[hwt 0]]: [./.][B/B] > >>> [csclprd3-6-1:24682] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket > >>> 0[core 1[hwt 0]]: [B/B][./.] > >>> [csclprd3-0-1:14318] MCW rank 25 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-1:14318] MCW rank 20 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-3:13827] MCW rank 34 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-1:14318] MCW rank 21 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-3:13827] MCW rank 35 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-6-1:24682] MCW rank 1 bound to socket 1[core 2[hwt 0]], socket > >>> 1[core 3[hwt 0]]: [./.][B/B] > >>> [csclprd3-0-3:13827] MCW rank 36 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-6-1:24682] MCW rank 2 bound to socket 0[core 0[hwt 0]], socket > >>> 0[core 1[hwt 0]]: [B/B][./.] 
> >>> [csclprd3-0-6:30371] MCW rank 51 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-6:30371] MCW rank 52 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-6:30371] MCW rank 53 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-2:05825] MCW rank 30 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-6:30371] MCW rank 54 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-3:13827] MCW rank 37 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-2:05825] MCW rank 31 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-3:13827] MCW rank 32 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-6:30371] MCW rank 55 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-3:13827] MCW rank 33 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-6:30371] MCW rank 50 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-2:05825] MCW rank 26 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-2:05825] MCW rank 27 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-2:05825] MCW rank 28 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-2:05825] MCW rank 29 is not bound (or bound to all available > >>> processors) > >>> [csclprd3-0-12:12383] MCW rank 121 is not bound (or bound to all > >>> available processors) > >>> [csclprd3-0-12:12383] MCW rank 122 is not bound (or bound to all > >>> available processors) > >>> [csclprd3-0-12:12383] MCW rank 123 is not bound (or bound to all > >>> available processors) > >>> [csclprd3-0-12:12383] MCW rank 124 is not bound (or bound to all > >>> available processors) > >>> [csclprd3-0-12:12383] MCW rank 125 is not bound (or bound to all > >>> available processors) > >>> [csclprd3-0-12:12383] MCW rank 120 is not bound (or bound to all > >>> available processors) > >>> [csclprd3-0-0:31079] MCW rank 13 bound to socket 1[core 6[hwt 0]], socket > >>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], > >>> socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: > >>> [./././././.][B/B/B/B/B/B] > >>> [csclprd3-0-0:31079] MCW rank 14 bound to socket 0[core 0[hwt 0]], socket > >>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], > >>> socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: > >>> [B/B/B/B/B/B][./././././.] > >>> [csclprd3-0-0:31079] MCW rank 15 bound to socket 1[core 6[hwt 0]], socket > >>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], > >>> socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: > >>> [./././././.][B/B/B/B/B/B] > >>> [csclprd3-0-0:31079] MCW rank 16 bound to socket 0[core 0[hwt 0]], socket > >>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], > >>> socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: > >>> [B/B/B/B/B/B][./././././.] > >>> [csclprd3-0-7:20515] MCW rank 68 bound to socket 0[core 0[hwt 0-1]], > >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > >>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket > >>> 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] 
> >>> [csclprd3-0-10:19096] MCW rank 100 bound to socket 0[core 0[hwt 0-1]], > >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > >>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket > >>> 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-7:20515] MCW rank 69 bound to socket 1[core 8[hwt 0-1]], > >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core > >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], > >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-10:19096] MCW rank 101 bound to socket 1[core 8[hwt 0-1]], > >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core > >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], > >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-0:31079] MCW rank 17 bound to socket 1[core 6[hwt 0]], socket > >>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], > >>> socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: > >>> [./././././.][B/B/B/B/B/B] > >>> [csclprd3-0-7:20515] MCW rank 70 bound to socket 0[core 0[hwt 0-1]], > >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > >>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket > >>> 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-10:19096] MCW rank 102 bound to socket 0[core 0[hwt 0-1]], > >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > >>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket > >>> 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-11:31636] MCW rank 116 bound to socket 0[core 0[hwt 0-1]], > >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > >>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket > >>> 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-11:31636] MCW rank 117 bound to socket 1[core 8[hwt 0-1]], > >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core > >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], > >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-0:31079] MCW rank 18 bound to socket 0[core 0[hwt 0]], socket > >>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], > >>> socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: > >>> [B/B/B/B/B/B][./././././.] > >>> [csclprd3-0-11:31636] MCW rank 118 bound to socket 0[core 0[hwt 0-1]], > >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > >>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket > >>> 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] 
> >>> [csclprd3-0-0:31079] MCW rank 19 bound to socket 1[core 6[hwt 0]], socket > >>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], > >>> socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: > >>> [./././././.][B/B/B/B/B/B] > >>> [csclprd3-0-7:20515] MCW rank 71 bound to socket 1[core 8[hwt 0-1]], > >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core > >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], > >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-10:19096] MCW rank 103 bound to socket 1[core 8[hwt 0-1]], > >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core > >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], > >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-0:31079] MCW rank 8 bound to socket 0[core 0[hwt 0]], socket > >>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], > >>> socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: > >>> [B/B/B/B/B/B][./././././.] > >>> [csclprd3-0-0:31079] MCW rank 9 bound to socket 1[core 6[hwt 0]], socket > >>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], > >>> socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: > >>> [./././././.][B/B/B/B/B/B] > >>> [csclprd3-0-10:19096] MCW rank 88 bound to socket 0[core 0[hwt 0-1]], > >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > >>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket > >>> 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-11:31636] MCW rank 119 bound to socket 1[core 8[hwt 0-1]], > >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core > >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], > >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-7:20515] MCW rank 56 bound to socket 0[core 0[hwt 0-1]], > >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > >>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket > >>> 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-0:31079] MCW rank 10 bound to socket 0[core 0[hwt 0]], socket > >>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], > >>> socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: > >>> [B/B/B/B/B/B][./././././.] 
> >>> [csclprd3-0-7:20515] MCW rank 57 bound to socket 1[core 8[hwt 0-1]], > >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core > >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], > >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-10:19096] MCW rank 89 bound to socket 1[core 8[hwt 0-1]], > >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core > >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], > >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: > >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] > >>> [csclprd3-0-11:31636] MCW rank 104 bound to socket 0[core 0[hwt 0-1]], > >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt > >>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket > >>> 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: > >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] > >>> [csclprd3-0-0:31079] MCW rank 11 bound to socket 1[core 6[hwt 0]], socket > >>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], > >>> socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: > >>> [./././././.][B/B/B/B/B/B] > >>> [csclprd3-0-0:31079] MCW rank 12 bound to socket 0[core 0[hwt 0]], socket > >>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], > >>> socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: > >>> [B/B/B/B/B/B][./././././.] > >>> [csclprd3-0-4:30348] MCW rank 42 is not bound (or bound to all
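
The backtraces earlier in this run all terminate inside mca_btl_sm_add_procs -> ompi_free_list_resize_mt -> ompi_free_list_grow with signal 7, and on Linux a bus error with code "non-existent physical address" is what a process receives when it touches a page of an mmap()ed region that has no backing store behind it, for instance when a shared-memory backing file could not be extended to the size that was mapped. Whether that is actually what is happening to the sm BTL on csclprd3-0-13 is an open question (it would mean checking the free space on the filesystem holding Open MPI's session directory on that node), but the mechanism itself is easy to demonstrate. The following is a minimal, self-contained C sketch of that failure mode, purely an illustration and not Open MPI code: it maps two pages over a file that is only one page long and then touches the unbacked page, which raises the same SIGBUS seen in the traces.

/* sigbus_demo.c - hypothetical illustration only: touching an mmap()ed page
 * that has no backing store in the underlying file raises SIGBUS, i.e. the
 * "Bus error (7) / non-existent physical address" reported in the traces. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    long page = sysconf(_SC_PAGESIZE);
    int fd = open("/tmp/sigbus_demo_backing", O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0) { perror("open"); return 1; }

    /* The backing file is only ONE page long ... */
    if (ftruncate(fd, page) != 0) { perror("ftruncate"); return 1; }

    /* ... but we map TWO pages, as if the segment had been sized for more
     * local peers than the file can actually back. */
    char *seg = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (seg == MAP_FAILED) { perror("mmap"); return 1; }

    seg[0] = 1;     /* fine: this page is backed by the file                 */
    seg[page] = 1;  /* SIGBUS: this page lies beyond the end of the file      */

    printf("never reached\n");
    munmap(seg, 2 * page);
    close(fd);
    return 0;
}

Built with "gcc sigbus_demo.c -o sigbus_demo", the first store should succeed and the second should kill the process with a bus error; it only illustrates the mechanism and is not a diagnosis of the cluster.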
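On the hyperthreading question running through the binding output above (some nodes report cores as [hwt 0] only, others as [hwt 0-1]), the per-node facts at issue, namely how many sockets, how many cores per socket, and whether HT is enabled, can be gathered with a few hwloc calls instead of BIOS access. The sketch below is a hypothetical helper written against the hwloc 1.x C API, the same library Open MPI 1.8 uses internally; it prints one summary line for the host it runs on, and hyperthreading is enabled wherever the PU count exceeds the core count.

/* topo_report.c - hypothetical sketch using the hwloc 1.x C API:
 * print sockets, cores per socket, and PUs per core for this node.
 * More than one PU per core means hyperthreading is enabled. */
#include <hwloc.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char host[256];
    hwloc_topology_t topo;

    gethostname(host, sizeof(host));
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    int sockets = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_SOCKET);
    int cores   = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
    int pus     = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_PU);

    printf("%s: %d socket(s), %d core(s)/socket, %d PU(s)/core (HT %s)\n",
           host, sockets,
           sockets > 0 ? cores / sockets : cores,
           cores > 0 ? pus / cores : pus,
           (cores > 0 && pus > cores) ? "on" : "off");

    hwloc_topology_destroy(topo);
    return 0;
}

Building it with "gcc topo_report.c -lhwloc" and running one copy on each host should give a compact per-node summary of the node types involved, the same information lstopo prints but in a form that is easy to collect from a batch job.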