Would the output of dmidecode -t processor and/or lstopo tell me conclusively
if hyperthreading is enabled or not? Hyperthreading is supposed to be enabled
for all the IBM x3550 M3 and M4 nodes, but I'm not sure if it actually is and I
don't have access to the BIOS settings.
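For what it's worth, dmidecode can answer this directly when the BIOS exposes the newer 40-byte SMBIOS type-4 records (as the x3550 output later in this thread does): "Thread Count" greater than "Core Count" means HT is on. The older blades' 32-byte records lack those fields, so counting lstopo PUs is the fallback there. A minimal sketch, assuming root access for dmidecode (the helper name is mine, not a standard tool):

```shell
# Compare SMBIOS "Thread Count" to "Core Count": threads > cores => HT enabled.
ht_status() {
    threads=$1; cores=$2
    if [ "$threads" -gt "$cores" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

# On a live node (needs root), the two fields can be pulled like this:
#   dmidecode -t processor | \
#       awk -F': ' '/Core Count/ {c=$2} /Thread Count/ {t=$2} END {print t, c}'

ht_status 12 6   # a 6-core CPU reporting 12 threads
ht_status 6 6    # a 6-core CPU reporting 6 threads
```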

-Bill L.

________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
[r...@open-mpi.org]
Sent: Saturday, June 27, 2015 7:21 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash

Bill - this is such a jumbled collection of machines that I’m having trouble 
figuring out what I should replicate. I can create something artificial here so 
I can try to debug this, but I need to know exactly what I’m up against - can 
you tell me:

* the architecture of each type - how many sockets, how many cores/socket, HT 
on or off. If two nodes have the same physical setup but one has HT on and the 
other off, then please consider those two different types

* how many nodes of each type

Looking at your map output, it looks like the map is being done correctly, but 
somehow the binding locale isn’t getting set in some cases. Your latest error 
output seems out of step with your prior reports, so something else may be 
going on there. As I said earlier, this is the most hetero environment we’ve 
seen, so there may be some code paths you’re hitting that haven’t been well 
exercised before.




On Jun 26, 2015, at 5:22 PM, Lane, William 
<william.l...@cshs.org<mailto:william.l...@cshs.org>> wrote:

Well, I managed to get a successful mpirun at a slot count of 132 using --mca 
btl ^sm; however, when I increased the slot count to 160, mpirun crashed, 
producing only the following output:

mpirun -np 160 -display-devel-map --prefix /hpc/apps/mpi/openmpi/1.8.6/ 
--hostfile hostfile-noslots --mca btl ^sm --hetero-nodes --bind-to core 
/hpc/home/lanew/mpi/openmpi/ProcessColors3 >> out.txt 2>&1

--------------------------------------------------------------------------
WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.

  Node:  csclprd3-6-1

This usually is due to not having the required NUMA support installed
on the node. In some Linux distributions, the required support is
contained in the libnumactl and libnumactl-devel packages.
This is a warning only; your job will continue, though performance may be 
degraded.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to:     CORE
   Node:        csclprd3-6-1
   #processes:  2
   #cpus:       1

You can override this protection by adding the "overload-allowed"
option to your binding directive.
--------------------------------------------------------------------------
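As an aside, the override the message suggests can be spelled as a binding modifier in the 1.8 series (--bind-to core:overload-allowed). Shown here only as a diagnostic sketch - overloading cores will hurt performance, and if the blade really has 4 cores the slot count is the thing to fix. The snippet just prints the candidate invocation rather than running it:

```shell
# Sketch only: the same run with the "overload-allowed" binding modifier,
# as suggested by the error text. Printed, not executed, here.
cmd='mpirun -np 160 --hostfile hostfile-noslots --mca btl ^sm --hetero-nodes --bind-to core:overload-allowed ./ProcessColors3'
echo "$cmd"
```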

But csclprd3-6-1 (a blade) does have 2 CPUs on 2 separate sockets with 2 cores 
apiece, as shown in my dmidecode output:

    csclprd3-6-1 ~]# dmidecode -t processor
    # dmidecode 2.11
    SMBIOS 2.4 present.

    Handle 0x0008, DMI type 4, 32 bytes
    Processor Information
            Socket Designation: Socket 1 CPU 1
            Type: Central Processor
            Family: Xeon
            Manufacturer: GenuineIntel
            ID: F6 06 00 00 01 03 00 00
            Signature: Type 0, Family 6, Model 15, Stepping 6
            Flags:
                    FPU (Floating-point unit on-chip)
                    CX8 (CMPXCHG8 instruction supported)
                    APIC (On-chip APIC hardware supported)
            Version: Intel Xeon
            Voltage: 2.9 V
            External Clock: 333 MHz
            Max Speed: 4000 MHz
            Current Speed: 3000 MHz
            Status: Populated, Enabled
            Upgrade: ZIF Socket
            L1 Cache Handle: 0x0004
            L2 Cache Handle: 0x0005
            L3 Cache Handle: Not Provided

    Handle 0x0009, DMI type 4, 32 bytes
    Processor Information
            Socket Designation: Socket 2 CPU 2
            Type: Central Processor
            Family: Xeon
            Manufacturer: GenuineIntel
            ID: F6 06 00 00 01 03 00 00
            Signature: Type 0, Family 6, Model 15, Stepping 6
            Flags:
                    FPU (Floating-point unit on-chip)
                    CX8 (CMPXCHG8 instruction supported)
                    APIC (On-chip APIC hardware supported)
            Version: Intel Xeon
            Voltage: 2.9 V
            External Clock: 333 MHz
            Max Speed: 4000 MHz
            Current Speed: 3000 MHz
            Status: Populated, Enabled
            Upgrade: ZIF Socket
            L1 Cache Handle: 0x0006
            L2 Cache Handle: 0x0007
            L3 Cache Handle: Not Provided

    csclprd3-6-1 ~]# lstopo
    Machine (16GB)
      Socket L#0 + L2 L#0 (4096KB)
        L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
        L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#2)
      Socket L#1 + L2 L#1 (4096KB)
        L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#1)
        L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
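Reading the lstopo output above: each core carries exactly one PU, which indicates hyperthreading is off on this blade (with HT on, each core would show two PUs). The same check can be done by counting objects; the --only flags are standard hwloc lstopo options, but the comparison helper below is mine:

```shell
# More PUs than cores => SMT/hyperthreading is active.
smt_active() {
    pus=$1; cores=$2
    if [ "$pus" -gt "$cores" ]; then
        echo "HT on"
    else
        echo "HT off"
    fi
}

# On a live node:
#   smt_active "$(lstopo --only pu | wc -l)" "$(lstopo --only core | wc -l)"

smt_active 4 4    # csclprd3-6-1 above: 4 PUs, 4 cores
```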

csclprd3-0-1 information (which looks correct, as this particular x3550 should
have one socket of two populated, with a 6-core Xeon - i.e. 12 logical cores
with hyperthreading turned on):

    csclprd3-0-1 ~]# lstopo
    Machine (71GB)
      Socket L#0 + L3 L#0 (12MB)
        L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 
(P#0)
        L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 
(P#1)
        L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 
(P#2)
        L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 
(P#3)
        L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4 
(P#4)
        L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5 
(P#5)

    csclprd3-0-1 ~]# dmidecode -t processor
    # dmidecode 2.11
    # SMBIOS entry point at 0x7f6be000
    SMBIOS 2.5 present.

    Handle 0x0001, DMI type 4, 40 bytes
    Processor Information
            Socket Designation: Node 1 Socket 1
            Type: Central Processor
            Family: Xeon MP
            Manufacturer: Intel(R) Corporation
            ID: C2 06 02 00 FF FB EB BF
            Signature: Type 0, Family 6, Model 44, Stepping 2
            Flags:
                    FPU (Floating-point unit on-chip)
                    VME (Virtual mode extension)
                    DE (Debugging extension)
                    PSE (Page size extension)
                    TSC (Time stamp counter)
                    MSR (Model specific registers)
                    PAE (Physical address extension)
                    MCE (Machine check exception)
                    CX8 (CMPXCHG8 instruction supported)
                    APIC (On-chip APIC hardware supported)
                    SEP (Fast system call)
                    MTRR (Memory type range registers)
                    PGE (Page global enable)
                    MCA (Machine check architecture)
                    CMOV (Conditional move instruction supported)
                    PAT (Page attribute table)
                    PSE-36 (36-bit page size extension)
                    CLFSH (CLFLUSH instruction supported)
                    DS (Debug store)
                    ACPI (ACPI supported)
                    MMX (MMX technology supported)
                    FXSR (FXSAVE and FXSTOR instructions supported)
                    SSE (Streaming SIMD extensions)
                    SSE2 (Streaming SIMD extensions 2)
                    SS (Self-snoop)
                    HTT (Multi-threading)
                    TM (Thermal monitor supported)
                    PBE (Pending break enabled)
            Version: Intel(R) Xeon(R) CPU           E5645  @ 2.40GHz
            Voltage: 1.2 V
            External Clock: 5866 MHz
            Max Speed: 4400 MHz
            Current Speed: 2400 MHz
            Status: Populated, Enabled
            Upgrade: ZIF Socket
            L1 Cache Handle: 0x0002
            L2 Cache Handle: 0x0003
            L3 Cache Handle: 0x0004
            Serial Number: Not Specified
            Asset Tag: Not Specified
            Part Number: Not Specified
            Core Count: 6
            Core Enabled: 6
            Thread Count: 6
            Characteristics:
                    64-bit capable

    Handle 0x005A, DMI type 4, 40 bytes
    Processor Information
            Socket Designation: Node 1 Socket 2
            Type: Central Processor
            Family: Xeon MP
            Manufacturer: Not Specified
            ID: 00 00 00 00 00 00 00 00
            Signature: Type 0, Family 0, Model 0, Stepping 0
            Flags: None
            Version: Not Specified
            Voltage: 1.2 V
            External Clock: 5866 MHz
            Max Speed: 4400 MHz
            Current Speed: Unknown
            Status: Unpopulated
            Upgrade: ZIF Socket
            L1 Cache Handle: Not Provided
            L2 Cache Handle: Not Provided
            L3 Cache Handle: Not Provided
            Serial Number: Not Specified
            Asset Tag: Not Specified
            Part Number: Not Specified
            Characteristics: None


________________________________
From: users [users-boun...@open-mpi.org<mailto:users-boun...@open-mpi.org>] on 
behalf of Ralph Castain [r...@open-mpi.org<mailto:r...@open-mpi.org>]
Sent: Wednesday, June 24, 2015 6:06 AM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash

I think trying with --mca btl ^sm makes a lot of sense and may solve the 
problem. I also noted that we are having trouble with the topology of several 
of the nodes - seeing only one socket, non-HT, where you say we should see two 
sockets with HT enabled. In those cases, the locality is "unknown" - given that 
those procs are on remote nodes from the one being impacted, I don't think it 
should cause a problem. However, it isn't correct, and that raises flags.

My best guess of the root cause of that error is either we are getting bad 
topology info on those nodes, or we have a bug that is mishandling this 
scenario. It would probably be good to get this error fixed to ensure it isn't 
the source of the eventual crash, even though I'm not sure they are related.

Bill: Can we examine one of the problem nodes? Let's pick csclprd3-0-1 (or take 
another one from your list - just look for one where "locality" is reported as 
"unknown" for the procs in the output map). Can you run lstopo on that node and 
send us the output? In the above map, it is reporting a single socket with 6 
cores, non-HT. Is that what lstopo shows when run on the node? Is it what you 
expected?


On Wed, Jun 24, 2015 at 4:07 AM, Gilles Gouaillardet 
<gilles.gouaillar...@gmail.com<mailto:gilles.gouaillar...@gmail.com>> wrote:
Bill,

were you able to get a core file and analyze the stack with gdb ?

I suspect the error occurs in mca_btl_sm_add_procs, but this is just my best 
guess. If this is correct, can you check the value of 
mca_btl_sm_component.num_smp_procs ?

as a workaround, can you try
mpirun --mca btl ^sm ...

I do not see how I can tackle the root cause without being able to reproduce 
the issue :-(

can you try to reproduce the issue with the smallest hostfile, and then run 
lstopo on all the nodes ?
btw, you are not mixing 32-bit and 64-bit OSes, are you ?
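Collecting lstopo output from every node in the hostfile can be scripted; a sketch, assuming passwordless ssh and hostfile lines of the form "name slots=N" (the helper name is mine):

```shell
# Extract the hostname field from hostfile lines like "csclprd3-0-1 slots=6",
# skipping blank lines and comments.
hosts_from() {
    awk '!/^[[:space:]]*(#|$)/ {print $1}' "$1"
}

# On a live cluster:
#   for h in $(hosts_from hostfile-noslots); do
#       echo "== $h =="; ssh "$h" lstopo --no-io
#   done

printf 'csclprd3-0-1 slots=6\n# comment\ncsclprd3-6-1 slots=4\n' > /tmp/hf.$$
hosts_from /tmp/hf.$$
rm -f /tmp/hf.$$
```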

Cheers,

Gilles




On Wednesday, June 24, 2015, Lane, William 
<william.l...@cshs.org<mailto:william.l...@cshs.org>> wrote:
Gilles,

All the blades have only dual-core Xeons (without hyperthreading) populating 
both their sockets. All the x3550 nodes have hyperthreading-capable Xeon and 
Sandy Bridge server CPUs. It's possible hyperthreading has been disabled on 
some of these nodes, though. The 3-0-n nodes are all IBM x3550 nodes, while 
the 3-6-n nodes are all blade nodes.

I have run this exact same test code successfully in the past on another 
cluster (~200 Sunfire X2100 nodes with two dual-core Opterons apiece) with no 
issues on upwards of 390 slots. I even tested it recently with OpenMPI 1.8.5 
on another, smaller R&D cluster consisting of 10 Sunfire X2100 nodes (with 
two dual-core Opterons apiece). On this particular cluster I've only had 
success running this code on fewer than 132 slots.

Anyway, here's the results of the following mpirun:

mpirun -np 132 -display-devel-map --prefix /hpc/apps/mpi/openmpi/1.8.6/ 
--hostfile hostfile-noslots --mca btl_tcp_if_include eth0 --hetero-nodes 
--bind-to core /hpc/home/lanew/mpi/openmpi/ProcessColors3 >> out.txt 2>&1

--------------------------------------------------------------------------
WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.

  Node:  csclprd3-6-1

This usually is due to not having the required NUMA support installed
on the node. In some Linux distributions, the required support is
contained in the libnumactl and libnumactl-devel packages.
This is a warning only; your job will continue, though performance may be 
degraded.
--------------------------------------------------------------------------
 Data for JOB [51718,1] offset 0

 Mapper requested: NULL  Last mapper: round_robin  Mapping policy: BYSOCKET  
Ranking policy: SLOT
 Binding policy: CORE  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
     Num new daemons: 0    New daemon starting vpid INVALID
     Num nodes: 15

 Data for node: csclprd3-6-1         Launch id: -1    State: 0
     Daemon: [[51718,0],1]    Daemon launched: True
     Num slots: 4    Slots in use: 4    Oversubscribed: FALSE
     Num slots allocated: 4    Max slots: 0
     Username on node: NULL
     Num procs: 4    Next node_rank: 4
     Data for proc: [[51718,1],0]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 0
         State: INITIALIZED    App_context: 0
         Locale: [B/B][./.]
         Binding: [B/.][./.]
     Data for proc: [[51718,1],1]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 1
         State: INITIALIZED    App_context: 0
         Locale: [./.][B/B]
         Binding: [./.][B/.]
     Data for proc: [[51718,1],2]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 2
         State: INITIALIZED    App_context: 0
         Locale: [B/B][./.]
         Binding: [./B][./.]
     Data for proc: [[51718,1],3]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 3
         State: INITIALIZED    App_context: 0
         Locale: [./.][B/B]
         Binding: [./.][./B]

 Data for node: csclprd3-6-5         Launch id: -1    State: 0
     Daemon: [[51718,0],2]    Daemon launched: True
     Num slots: 4    Slots in use: 4    Oversubscribed: FALSE
     Num slots allocated: 4    Max slots: 0
     Username on node: NULL
     Num procs: 4    Next node_rank: 4
     Data for proc: [[51718,1],4]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 4
         State: INITIALIZED    App_context: 0
         Locale: [B/B][./.]
         Binding: [B/.][./.]
     Data for proc: [[51718,1],5]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 5
         State: INITIALIZED    App_context: 0
         Locale: [./.][B/B]
         Binding: [./.][B/.]
     Data for proc: [[51718,1],6]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 6
         State: INITIALIZED    App_context: 0
         Locale: [B/B][./.]
         Binding: [./B][./.]
     Data for proc: [[51718,1],7]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 7
         State: INITIALIZED    App_context: 0
         Locale: [./.][B/B]
         Binding: [./.][./B]

 Data for node: csclprd3-0-0         Launch id: -1    State: 0
     Daemon: [[51718,0],3]    Daemon launched: True
     Num slots: 12    Slots in use: 12    Oversubscribed: FALSE
     Num slots allocated: 12    Max slots: 0
     Username on node: NULL
     Num procs: 12    Next node_rank: 12
     Data for proc: [[51718,1],8]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 8
         State: INITIALIZED    App_context: 0
         Locale: [B/B/B/B/B/B][./././././.]
         Binding: [B/././././.][./././././.]
     Data for proc: [[51718,1],9]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 9
         State: INITIALIZED    App_context: 0
         Locale: [./././././.][B/B/B/B/B/B]
         Binding: [./././././.][B/././././.]
     Data for proc: [[51718,1],10]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 10
         State: INITIALIZED    App_context: 0
         Locale: [B/B/B/B/B/B][./././././.]
         Binding: [./B/./././.][./././././.]
     Data for proc: [[51718,1],11]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 11
         State: INITIALIZED    App_context: 0
         Locale: [./././././.][B/B/B/B/B/B]
         Binding: [./././././.][./B/./././.]
     Data for proc: [[51718,1],12]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 12
         State: INITIALIZED    App_context: 0
         Locale: [B/B/B/B/B/B][./././././.]
         Binding: [././B/././.][./././././.]
     Data for proc: [[51718,1],13]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 13
         State: INITIALIZED    App_context: 0
         Locale: [./././././.][B/B/B/B/B/B]
         Binding: [./././././.][././B/././.]
     Data for proc: [[51718,1],14]
         Pid: 0    Local rank: 6    Node rank: 6    App rank: 14
         State: INITIALIZED    App_context: 0
         Locale: [B/B/B/B/B/B][./././././.]
         Binding: [./././B/./.][./././././.]
     Data for proc: [[51718,1],15]
         Pid: 0    Local rank: 7    Node rank: 7    App rank: 15
         State: INITIALIZED    App_context: 0
         Locale: [./././././.][B/B/B/B/B/B]
         Binding: [./././././.][./././B/./.]
     Data for proc: [[51718,1],16]
         Pid: 0    Local rank: 8    Node rank: 8    App rank: 16
         State: INITIALIZED    App_context: 0
         Locale: [B/B/B/B/B/B][./././././.]
         Binding: [././././B/.][./././././.]
     Data for proc: [[51718,1],17]
         Pid: 0    Local rank: 9    Node rank: 9    App rank: 17
         State: INITIALIZED    App_context: 0
         Locale: [./././././.][B/B/B/B/B/B]
         Binding: [./././././.][././././B/.]
     Data for proc: [[51718,1],18]
         Pid: 0    Local rank: 10    Node rank: 10    App rank: 18
         State: INITIALIZED    App_context: 0
         Locale: [B/B/B/B/B/B][./././././.]
         Binding: [./././././B][./././././.]
     Data for proc: [[51718,1],19]
         Pid: 0    Local rank: 11    Node rank: 11    App rank: 19
         State: INITIALIZED    App_context: 0
         Locale: [./././././.][B/B/B/B/B/B]
         Binding: [./././././.][./././././B]

 Data for node: csclprd3-0-1         Launch id: -1    State: 0
     Daemon: [[51718,0],4]    Daemon launched: True
     Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
     Num slots allocated: 6    Max slots: 0
     Username on node: NULL
     Num procs: 6    Next node_rank: 6
     Data for proc: [[51718,1],20]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 20
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [B/././././.]
     Data for proc: [[51718,1],21]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 21
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./B/./././.]
     Data for proc: [[51718,1],22]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 22
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [././B/././.]
     Data for proc: [[51718,1],23]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 23
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./././B/./.]
     Data for proc: [[51718,1],24]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 24
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [././././B/.]
     Data for proc: [[51718,1],25]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 25
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./././././B]

 Data for node: csclprd3-0-2         Launch id: -1    State: 0
     Daemon: [[51718,0],5]    Daemon launched: True
     Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
     Num slots allocated: 6    Max slots: 0
     Username on node: NULL
     Num procs: 6    Next node_rank: 6
     Data for proc: [[51718,1],26]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 26
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [B/././././.]
     Data for proc: [[51718,1],27]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 27
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./B/./././.]
     Data for proc: [[51718,1],28]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 28
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [././B/././.]
     Data for proc: [[51718,1],29]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 29
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./././B/./.]
     Data for proc: [[51718,1],30]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 30
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [././././B/.]
     Data for proc: [[51718,1],31]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 31
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./././././B]

 Data for node: csclprd3-0-3         Launch id: -1    State: 0
     Daemon: [[51718,0],6]    Daemon launched: True
     Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
     Num slots allocated: 6    Max slots: 0
     Username on node: NULL
     Num procs: 6    Next node_rank: 6
     Data for proc: [[51718,1],32]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 32
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [B/././././.]
     Data for proc: [[51718,1],33]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 33
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./B/./././.]
     Data for proc: [[51718,1],34]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 34
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [././B/././.]
     Data for proc: [[51718,1],35]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 35
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./././B/./.]
     Data for proc: [[51718,1],36]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 36
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [././././B/.]
     Data for proc: [[51718,1],37]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 37
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./././././B]

 Data for node: csclprd3-0-4         Launch id: -1    State: 0
     Daemon: [[51718,0],7]    Daemon launched: True
     Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
     Num slots allocated: 6    Max slots: 0
     Username on node: NULL
     Num procs: 6    Next node_rank: 6
     Data for proc: [[51718,1],38]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 38
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [B/././././.]
     Data for proc: [[51718,1],39]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 39
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./B/./././.]
     Data for proc: [[51718,1],40]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 40
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [././B/././.]
     Data for proc: [[51718,1],41]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 41
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./././B/./.]
     Data for proc: [[51718,1],42]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 42
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [././././B/.]
     Data for proc: [[51718,1],43]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 43
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./././././B]

 Data for node: csclprd3-0-5         Launch id: -1    State: 0
     Daemon: [[51718,0],8]    Daemon launched: True
     Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
     Num slots allocated: 6    Max slots: 0
     Username on node: NULL
     Num procs: 6    Next node_rank: 6
     Data for proc: [[51718,1],44]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 44
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [B/././././.]
     Data for proc: [[51718,1],45]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 45
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./B/./././.]
     Data for proc: [[51718,1],46]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 46
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [././B/././.]
     Data for proc: [[51718,1],47]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 47
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./././B/./.]
     Data for proc: [[51718,1],48]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 48
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [././././B/.]
     Data for proc: [[51718,1],49]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 49
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./././././B]

 Data for node: csclprd3-0-6         Launch id: -1    State: 0
     Daemon: [[51718,0],9]    Daemon launched: True
     Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
     Num slots allocated: 6    Max slots: 0
     Username on node: NULL
     Num procs: 6    Next node_rank: 6
     Data for proc: [[51718,1],50]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 50
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [B/././././.]
     Data for proc: [[51718,1],51]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 51
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./B/./././.]
     Data for proc: [[51718,1],52]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 52
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [././B/././.]
     Data for proc: [[51718,1],53]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 53
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./././B/./.]
     Data for proc: [[51718,1],54]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 54
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [././././B/.]
     Data for proc: [[51718,1],55]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 55
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [./././././B]

 Data for node: csclprd3-0-7         Launch id: -1    State: 0
     Daemon: [[51718,0],10]    Daemon launched: True
     Num slots: 16    Slots in use: 16    Oversubscribed: FALSE
     Num slots allocated: 16    Max slots: 0
     Username on node: NULL
     Num procs: 16    Next node_rank: 16
     Data for proc: [[51718,1],56]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 56
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [BB/../../../../../../..][../../../../../../../..]
     Data for proc: [[51718,1],57]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 57
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][BB/../../../../../../..]
     Data for proc: [[51718,1],58]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 58
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../BB/../../../../../..][../../../../../../../..]
     Data for proc: [[51718,1],59]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 59
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../BB/../../../../../..]
     Data for proc: [[51718,1],60]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 60
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../BB/../../../../..][../../../../../../../..]
     Data for proc: [[51718,1],61]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 61
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../BB/../../../../..]
     Data for proc: [[51718,1],62]
         Pid: 0    Local rank: 6    Node rank: 6    App rank: 62
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../BB/../../../..][../../../../../../../..]
     Data for proc: [[51718,1],63]
         Pid: 0    Local rank: 7    Node rank: 7    App rank: 63
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../BB/../../../..]
     Data for proc: [[51718,1],64]
         Pid: 0    Local rank: 8    Node rank: 8    App rank: 64
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../BB/../../..][../../../../../../../..]
     Data for proc: [[51718,1],65]
         Pid: 0    Local rank: 9    Node rank: 9    App rank: 65
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../BB/../../..]
     Data for proc: [[51718,1],66]
         Pid: 0    Local rank: 10    Node rank: 10    App rank: 66
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../../BB/../..][../../../../../../../..]
     Data for proc: [[51718,1],67]
         Pid: 0    Local rank: 11    Node rank: 11    App rank: 67
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../../BB/../..]
     Data for proc: [[51718,1],68]
         Pid: 0    Local rank: 12    Node rank: 12    App rank: 68
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../../../BB/..][../../../../../../../..]
     Data for proc: [[51718,1],69]
         Pid: 0    Local rank: 13    Node rank: 13    App rank: 69
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../../../BB/..]
     Data for proc: [[51718,1],70]
         Pid: 0    Local rank: 14    Node rank: 14    App rank: 70
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../../../../BB][../../../../../../../..]
     Data for proc: [[51718,1],71]
         Pid: 0    Local rank: 15    Node rank: 15    App rank: 71
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../../../../BB]

 Data for node: csclprd3-0-8         Launch id: -1    State: 0
     Daemon: [[51718,0],11]    Daemon launched: True
     Num slots: 16    Slots in use: 16    Oversubscribed: FALSE
     Num slots allocated: 16    Max slots: 0
     Username on node: NULL
     Num procs: 16    Next node_rank: 16
     Data for proc: [[51718,1],72]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 72
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [BB/../../../../../../..][../../../../../../../..]
     Data for proc: [[51718,1],73]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 73
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][BB/../../../../../../..]
     Data for proc: [[51718,1],74]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 74
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../BB/../../../../../..][../../../../../../../..]
     Data for proc: [[51718,1],75]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 75
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../BB/../../../../../..]
     Data for proc: [[51718,1],76]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 76
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../BB/../../../../..][../../../../../../../..]
     Data for proc: [[51718,1],77]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 77
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../BB/../../../../..]
     Data for proc: [[51718,1],78]
         Pid: 0    Local rank: 6    Node rank: 6    App rank: 78
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../BB/../../../..][../../../../../../../..]
     Data for proc: [[51718,1],79]
         Pid: 0    Local rank: 7    Node rank: 7    App rank: 79
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../BB/../../../..]
     Data for proc: [[51718,1],80]
         Pid: 0    Local rank: 8    Node rank: 8    App rank: 80
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../BB/../../..][../../../../../../../..]
     Data for proc: [[51718,1],81]
         Pid: 0    Local rank: 9    Node rank: 9    App rank: 81
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../BB/../../..]
     Data for proc: [[51718,1],82]
         Pid: 0    Local rank: 10    Node rank: 10    App rank: 82
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../../BB/../..][../../../../../../../..]
     Data for proc: [[51718,1],83]
         Pid: 0    Local rank: 11    Node rank: 11    App rank: 83
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../../BB/../..]
     Data for proc: [[51718,1],84]
         Pid: 0    Local rank: 12    Node rank: 12    App rank: 84
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../../../BB/..][../../../../../../../..]
     Data for proc: [[51718,1],85]
         Pid: 0    Local rank: 13    Node rank: 13    App rank: 85
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../../../BB/..]
     Data for proc: [[51718,1],86]
         Pid: 0    Local rank: 14    Node rank: 14    App rank: 86
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../../../../BB][../../../../../../../..]
     Data for proc: [[51718,1],87]
         Pid: 0    Local rank: 15    Node rank: 15    App rank: 87
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../../../../BB]

 Data for node: csclprd3-0-10         Launch id: -1    State: 0
     Daemon: [[51718,0],12]    Daemon launched: True
     Num slots: 16    Slots in use: 16    Oversubscribed: FALSE
     Num slots allocated: 16    Max slots: 0
     Username on node: NULL
     Num procs: 16    Next node_rank: 16
     Data for proc: [[51718,1],88]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 88
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [BB/../../../../../../..][../../../../../../../..]
     Data for proc: [[51718,1],89]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 89
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][BB/../../../../../../..]
     Data for proc: [[51718,1],90]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 90
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../BB/../../../../../..][../../../../../../../..]
     Data for proc: [[51718,1],91]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 91
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../BB/../../../../../..]
     Data for proc: [[51718,1],92]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 92
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../BB/../../../../..][../../../../../../../..]
     Data for proc: [[51718,1],93]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 93
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../BB/../../../../..]
     Data for proc: [[51718,1],94]
         Pid: 0    Local rank: 6    Node rank: 6    App rank: 94
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../BB/../../../..][../../../../../../../..]
     Data for proc: [[51718,1],95]
         Pid: 0    Local rank: 7    Node rank: 7    App rank: 95
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../BB/../../../..]
     Data for proc: [[51718,1],96]
         Pid: 0    Local rank: 8    Node rank: 8    App rank: 96
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../BB/../../..][../../../../../../../..]
     Data for proc: [[51718,1],97]
         Pid: 0    Local rank: 9    Node rank: 9    App rank: 97
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../BB/../../..]
     Data for proc: [[51718,1],98]
         Pid: 0    Local rank: 10    Node rank: 10    App rank: 98
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../../BB/../..][../../../../../../../..]
     Data for proc: [[51718,1],99]
         Pid: 0    Local rank: 11    Node rank: 11    App rank: 99
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../../BB/../..]
     Data for proc: [[51718,1],100]
         Pid: 0    Local rank: 12    Node rank: 12    App rank: 100
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../../../BB/..][../../../../../../../..]
     Data for proc: [[51718,1],101]
         Pid: 0    Local rank: 13    Node rank: 13    App rank: 101
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../../../BB/..]
     Data for proc: [[51718,1],102]
         Pid: 0    Local rank: 14    Node rank: 14    App rank: 102
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../../../../BB][../../../../../../../..]
     Data for proc: [[51718,1],103]
         Pid: 0    Local rank: 15    Node rank: 15    App rank: 103
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../../../../BB]

 Data for node: csclprd3-0-11         Launch id: -1    State: 0
     Daemon: [[51718,0],13]    Daemon launched: True
     Num slots: 16    Slots in use: 16    Oversubscribed: FALSE
     Num slots allocated: 16    Max slots: 0
     Username on node: NULL
     Num procs: 16    Next node_rank: 16
     Data for proc: [[51718,1],104]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 104
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [BB/../../../../../../..][../../../../../../../..]
     Data for proc: [[51718,1],105]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 105
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][BB/../../../../../../..]
     Data for proc: [[51718,1],106]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 106
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../BB/../../../../../..][../../../../../../../..]
     Data for proc: [[51718,1],107]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 107
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../BB/../../../../../..]
     Data for proc: [[51718,1],108]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 108
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../BB/../../../../..][../../../../../../../..]
     Data for proc: [[51718,1],109]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 109
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../BB/../../../../..]
     Data for proc: [[51718,1],110]
         Pid: 0    Local rank: 6    Node rank: 6    App rank: 110
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../BB/../../../..][../../../../../../../..]
     Data for proc: [[51718,1],111]
         Pid: 0    Local rank: 7    Node rank: 7    App rank: 111
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../BB/../../../..]
     Data for proc: [[51718,1],112]
         Pid: 0    Local rank: 8    Node rank: 8    App rank: 112
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../BB/../../..][../../../../../../../..]
     Data for proc: [[51718,1],113]
         Pid: 0    Local rank: 9    Node rank: 9    App rank: 113
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../BB/../../..]
     Data for proc: [[51718,1],114]
         Pid: 0    Local rank: 10    Node rank: 10    App rank: 114
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../../BB/../..][../../../../../../../..]
     Data for proc: [[51718,1],115]
         Pid: 0    Local rank: 11    Node rank: 11    App rank: 115
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../../BB/../..]
     Data for proc: [[51718,1],116]
         Pid: 0    Local rank: 12    Node rank: 12    App rank: 116
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../../../BB/..][../../../../../../../..]
     Data for proc: [[51718,1],117]
         Pid: 0    Local rank: 13    Node rank: 13    App rank: 117
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../../../BB/..]
     Data for proc: [[51718,1],118]
         Pid: 0    Local rank: 14    Node rank: 14    App rank: 118
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
         Binding: [../../../../../../../BB][../../../../../../../..]
     Data for proc: [[51718,1],119]
         Pid: 0    Local rank: 15    Node rank: 15    App rank: 119
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../../../..][../../../../../../../BB]

 Data for node: csclprd3-0-12         Launch id: -1    State: 0
     Daemon: [[51718,0],14]    Daemon launched: True
     Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
     Num slots allocated: 6    Max slots: 0
     Username on node: NULL
     Num procs: 6    Next node_rank: 6
     Data for proc: [[51718,1],120]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 120
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [BB/../../../../..]
     Data for proc: [[51718,1],121]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 121
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [../BB/../../../..]
     Data for proc: [[51718,1],122]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 122
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [../../BB/../../..]
     Data for proc: [[51718,1],123]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 123
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [../../../BB/../..]
     Data for proc: [[51718,1],124]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 124
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [../../../../BB/..]
     Data for proc: [[51718,1],125]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 125
         State: INITIALIZED    App_context: 0
         Locale: UNKNOWN
         Binding: [../../../../../BB]

 Data for node: csclprd3-0-13         Launch id: -1    State: 0
     Daemon: [[51718,0],15]    Daemon launched: True
     Num slots: 12    Slots in use: 6    Oversubscribed: FALSE
     Num slots allocated: 12    Max slots: 0
     Username on node: NULL
     Num procs: 6    Next node_rank: 6
     Data for proc: [[51718,1],126]
         Pid: 0    Local rank: 0    Node rank: 0    App rank: 126
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB][../../../../../..]
         Binding: [BB/../../../../..][../../../../../..]
     Data for proc: [[51718,1],127]
         Pid: 0    Local rank: 1    Node rank: 1    App rank: 127
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../..][BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../..][BB/../../../../..]
     Data for proc: [[51718,1],128]
         Pid: 0    Local rank: 2    Node rank: 2    App rank: 128
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB][../../../../../..]
         Binding: [../BB/../../../..][../../../../../..]
     Data for proc: [[51718,1],129]
         Pid: 0    Local rank: 3    Node rank: 3    App rank: 129
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../..][BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../..][../BB/../../../..]
     Data for proc: [[51718,1],130]
         Pid: 0    Local rank: 4    Node rank: 4    App rank: 130
         State: INITIALIZED    App_context: 0
         Locale: [BB/BB/BB/BB/BB/BB][../../../../../..]
         Binding: [../../BB/../../..][../../../../../..]
     Data for proc: [[51718,1],131]
         Pid: 0    Local rank: 5    Node rank: 5    App rank: 131
         State: INITIALIZED    App_context: 0
         Locale: [../../../../../..][BB/BB/BB/BB/BB/BB]
         Binding: [../../../../../..][../../BB/../../..]
[csclprd3-0-13:31619] *** Process received signal ***
[csclprd3-0-13:31619] Signal: Bus error (7)
[csclprd3-0-13:31619] Signal code: Non-existant physical address (2)
[csclprd3-0-13:31619] Failing at address: 0x7f1374267a00
[csclprd3-0-13:31620] *** Process received signal ***
[csclprd3-0-13:31620] Signal: Bus error (7)
[csclprd3-0-13:31620] Signal code: Non-existant physical address (2)
[csclprd3-0-13:31620] Failing at address: 0x7fcc702a7980
[csclprd3-0-13:31615] *** Process received signal ***
[csclprd3-0-13:31615] Signal: Bus error (7)
[csclprd3-0-13:31615] Signal code: Non-existant physical address (2)
[csclprd3-0-13:31615] Failing at address: 0x7f8128367880
[csclprd3-0-13:31616] *** Process received signal ***
[csclprd3-0-13:31616] Signal: Bus error (7)
[csclprd3-0-13:31616] Signal code: Non-existant physical address (2)
[csclprd3-0-13:31616] Failing at address: 0x7fe674227a00
[csclprd3-0-13:31617] *** Process received signal ***
[csclprd3-0-13:31617] Signal: Bus error (7)
[csclprd3-0-13:31617] Signal code: Non-existant physical address (2)
[csclprd3-0-13:31617] Failing at address: 0x7f061c32db80
[csclprd3-0-13:31618] *** Process received signal ***
[csclprd3-0-13:31618] Signal: Bus error (7)
[csclprd3-0-13:31618] Signal code: Non-existant physical address (2)
[csclprd3-0-13:31618] Failing at address: 0x7fb8402eaa80
[csclprd3-0-13:31618] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7fb851851500]
[csclprd3-0-13:31618] [ 1] [csclprd3-0-13:31616] [ 0] 
/lib64/libpthread.so.0(+0xf500)[0x7fe6843a4500]
[csclprd3-0-13:31616] [ 1] [csclprd3-0-13:31620] [ 0] 
/lib64/libpthread.so.0(+0xf500)[0x7fcc80c54500]
[csclprd3-0-13:31620] [ 1] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fcc80fc9f61]
[csclprd3-0-13:31620] [ 2] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fcc80fca047]
[csclprd3-0-13:31620] [ 3] [csclprd3-0-13:31615] [ 0] 
/lib64/libpthread.so.0(+0xf500)[0x7f81385ca500]
[csclprd3-0-13:31615] [ 1] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f813893ff61]
[csclprd3-0-13:31615] [ 2] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f8138940047]
[csclprd3-0-13:31615] [ 3] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fb851bc6f61]
[csclprd3-0-13:31618] [ 2] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fb851bc7047]
[csclprd3-0-13:31618] [ 3] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fb851ab4670]
[csclprd3-0-13:31618] [ 4] [csclprd3-0-13:31617] [ 0] 
/lib64/libpthread.so.0(+0xf500)[0x7f062cfe5500]
[csclprd3-0-13:31617] [ 1] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f062d35af61]
[csclprd3-0-13:31617] [ 2] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f062d35b047]
[csclprd3-0-13:31617] [ 3] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f062d248670]
[csclprd3-0-13:31617] [ 4] [csclprd3-0-13:31619] [ 0] 
/lib64/libpthread.so.0(+0xf500)[0x7f1384fde500]
[csclprd3-0-13:31619] [ 1] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f1385353f61]
[csclprd3-0-13:31619] [ 2] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fe684719f61]
[csclprd3-0-13:31616] [ 2] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fe68471a047]
[csclprd3-0-13:31616] [ 3] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fe684607670]
[csclprd3-0-13:31616] [ 4] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f1385354047]
[csclprd3-0-13:31619] [ 3] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f1385241670]
[csclprd3-0-13:31619] [ 4] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f13852425ab]
[csclprd3-0-13:31619] [ 5] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f1385242751]
[csclprd3-0-13:31619] [ 6] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f13853501c9]
[csclprd3-0-13:31619] [ 7] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f1385336628]
[csclprd3-0-13:31619] [ 8] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fcc80eb7670]
[csclprd3-0-13:31620] [ 4] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fcc80eb85ab]
[csclprd3-0-13:31620] [ 5] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fcc80eb8751]
[csclprd3-0-13:31620] [ 6] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fcc80fc61c9]
[csclprd3-0-13:31620] [ 7] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fcc80fac628]
[csclprd3-0-13:31620] [ 8] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fcc8111fd61]
[csclprd3-0-13:31620] [ 9] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f813882d670]
[csclprd3-0-13:31615] [ 4] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f813882e5ab]
[csclprd3-0-13:31615] [ 5] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f813882e751]
[csclprd3-0-13:31615] [ 6] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f813893c1c9]
[csclprd3-0-13:31615] [ 7] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f8138922628]
[csclprd3-0-13:31615] [ 8] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f8138a95d61]
[csclprd3-0-13:31615] [ 9] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f813885d747]
[csclprd3-0-13:31615] [10] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fb851ab55ab]
[csclprd3-0-13:31618] [ 5] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fb851ab5751]
[csclprd3-0-13:31618] [ 6] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fb851bc31c9]
[csclprd3-0-13:31618] [ 7] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fb851ba9628]
[csclprd3-0-13:31618] [ 8] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fb851d1cd61]
[csclprd3-0-13:31618] [ 9] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fb851ae4747]
[csclprd3-0-13:31618] [10] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f062d2495ab]
[csclprd3-0-13:31617] [ 5] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f062d249751]
[csclprd3-0-13:31617] [ 6] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f062d3571c9]
[csclprd3-0-13:31617] [ 7] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f062d33d628]
[csclprd3-0-13:31617] [ 8] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f062d4b0d61]
[csclprd3-0-13:31617] [ 9] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f062d278747]
[csclprd3-0-13:31617] [10] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fe6846085ab]
[csclprd3-0-13:31616] [ 5] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fe684608751]
[csclprd3-0-13:31616] [ 6] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fe6847161c9]
[csclprd3-0-13:31616] [ 7] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fe6846fc628]
[csclprd3-0-13:31616] [ 8] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fe68486fd61]
[csclprd3-0-13:31616] [ 9] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fe684637747]
[csclprd3-0-13:31616] [10] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fe68467750b]
[csclprd3-0-13:31616] [11] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0]
[csclprd3-0-13:31616] [12] 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7fe684021cdd]
[csclprd3-0-13:31616] [13] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999]
[csclprd3-0-13:31616] *** End of error message ***
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f062d2b850b]
[csclprd3-0-13:31617] [11] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0]
[csclprd3-0-13:31617] [12] 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f062cc62cdd]
[csclprd3-0-13:31617] [13] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999]
[csclprd3-0-13:31617] *** End of error message ***
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f13854a9d61]
[csclprd3-0-13:31619] [ 9] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f1385271747]
[csclprd3-0-13:31619] [10] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f13852b150b]
[csclprd3-0-13:31619] [11] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0]
[csclprd3-0-13:31619] [12] 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f1384c5bcdd]
[csclprd3-0-13:31619] [13] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999]
[csclprd3-0-13:31619] *** End of error message ***
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fcc80ee7747]
[csclprd3-0-13:31620] [10] 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fcc80f2750b]
[csclprd3-0-13:31620] [11] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0]
[csclprd3-0-13:31620] [12] 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7fcc808d1cdd]
[csclprd3-0-13:31620] [13] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999]
[csclprd3-0-13:31620] *** End of error message ***
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f813889d50b]
[csclprd3-0-13:31615] [11] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0]
[csclprd3-0-13:31615] [12] 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8138247cdd]
[csclprd3-0-13:31615] [13] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999]
[csclprd3-0-13:31615] *** End of error message ***
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fb851b2450b]
[csclprd3-0-13:31618] [11] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0]
[csclprd3-0-13:31618] [12] 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7fb8514cecdd]
[csclprd3-0-13:31618] [13] /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999]
[csclprd3-0-13:31618] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 126 with PID 0 on node csclprd3-0-13 exited on 
signal 7 (Bus error).
--------------------------------------------------------------------------
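
For what it's worth, a SIGBUS with "Non-existant physical address" inside mca_btl_sm_add_procs is a common signature of the sm BTL failing to fault in pages of its mmap'ed backing file, most often because the filesystem holding the session directory (normally under /tmp in the 1.8 series, or /dev/shm on some setups) ran out of space. A minimal per-node check, assuming those default locations (the `check_shm_space` helper name is mine, not from this thread):

```shell
# Print available space on the filesystems where the sm BTL's
# shared-memory backing file may live (session dir under /tmp,
# or /dev/shm on some configurations).
check_shm_space() {
    for fs in /tmp /dev/shm; do
        # df -kP gives POSIX-format output; line 2, field 4 is KB available
        [ -d "$fs" ] && df -kP "$fs" | awk -v f="$fs" 'NR == 2 {print f ": " $4 " KB free"}'
    done
}
check_shm_space
```

Running this on the failing node (here csclprd3-0-13) before and during a large run would show whether the backing store is being exhausted.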

________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
[r...@open-mpi.org]
Sent: Tuesday, June 23, 2015 6:20 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash

Wow - that is one sick puppy! I see that some nodes are reporting not-bound for 
their procs, and the rest are binding to socket (as they should). Some of your 
nodes clearly do not have hyper threads enabled (or only have single-thread 
cores on them), and have 2 cores/socket. Other nodes have 8 cores/socket with 
hyper threads enabled, while still others have 6 cores/socket and HT enabled.

I don't see anyone binding to a single HT if multiple HTs/core are available. I 
think you are being fooled by those nodes that either don't have HT enabled, or 
have only 1 HT/core.
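
Whether HT is actually enabled on a given node can be verified without BIOS access: `lscpu` reports a "Thread(s) per core" line, and a value above 1 means hyperthreading is on. A small sketch of that check (the `ht_status` helper name is mine, not from this thread):

```shell
# Report hyperthreading status from `lscpu` output passed in as $1.
# Run on each node as:  ht_status "$(lscpu)"
ht_status() {
    # Extract the value after "Thread(s) per core:" and strip whitespace
    tpc=$(printf '%s\n' "$1" | awk -F: '/^Thread\(s\) per core/ {gsub(/[ \t]/, "", $2); print $2}')
    if [ "${tpc:-1}" -gt 1 ]; then
        echo "HT enabled ($tpc threads/core)"
    else
        echo "HT disabled"
    fi
}
```

Running this across the cluster would let each node be classified as HT-on or HT-off when sorting them into types.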

In both cases, node 13 is the one that fails. I also note that you said 
everything works okay with < 132 ranks, and node 13 hosts ranks 126-131, so 
node 13 would still host ranks even if you reduced the job to 131. This implies 
that it probably isn't something wrong with the node itself.

Is there any way you could run a job of this size on a homogeneous cluster? The 
procs all show bindings that look right, but I'm wondering if the heterogeneity 
is the source of the trouble here. We may be communicating the binding pattern 
incorrectly and giving bad info to the backend daemon.

Also, rather than --report-bindings, use "--display-devel-map" on the command 
line and let's see what the mapper thinks it did. If there is a problem with 
placement, that is where it will show up.


On Tue, Jun 23, 2015 at 5:12 PM, Lane, William <william.l...@cshs.org> wrote:
Ralph,

Something funny is going on: the traces from the runs with the debug build 
aren't showing any differences from what I got earlier. However, I did do a 
run with the --bind-to core switch and was surprised to see that 
hyperthreading cores were sometimes being used.

Here are the traces that I have:

mpirun -np 132 -report-bindings --prefix /hpc/apps/mpi/openmpi/1.8.6/ 
--hostfile hostfile-noslots --mca btl_tcp_if_include eth0 --hetero-nodes 
/hpc/home/lanew/mpi/openmpi/ProcessColors3
[csclprd3-0-5:16802] MCW rank 44 is not bound (or bound to all available 
processors)
[csclprd3-0-5:16802] MCW rank 45 is not bound (or bound to all available 
processors)
[csclprd3-0-5:16802] MCW rank 46 is not bound (or bound to all available 
processors)
[csclprd3-6-5:12480] MCW rank 4 bound to socket 0[core 0[hwt 0]], socket 0[core 
1[hwt 0]]: [B/B][./.]
[csclprd3-6-5:12480] MCW rank 5 bound to socket 1[core 2[hwt 0]], socket 1[core 
3[hwt 0]]: [./.][B/B]
[csclprd3-6-5:12480] MCW rank 6 bound to socket 0[core 0[hwt 0]], socket 0[core 
1[hwt 0]]: [B/B][./.]
[csclprd3-6-5:12480] MCW rank 7 bound to socket 1[core 2[hwt 0]], socket 1[core 
3[hwt 0]]: [./.][B/B]
[csclprd3-0-5:16802] MCW rank 47 is not bound (or bound to all available 
processors)
[csclprd3-0-5:16802] MCW rank 48 is not bound (or bound to all available 
processors)
[csclprd3-0-5:16802] MCW rank 49 is not bound (or bound to all available 
processors)
[csclprd3-0-1:14318] MCW rank 22 is not bound (or bound to all available 
processors)
[csclprd3-0-1:14318] MCW rank 23 is not bound (or bound to all available 
processors)
[csclprd3-0-1:14318] MCW rank 24 is not bound (or bound to all available 
processors)
[csclprd3-6-1:24682] MCW rank 3 bound to socket 1[core 2[hwt 0]], socket 1[core 
3[hwt 0]]: [./.][B/B]
[csclprd3-6-1:24682] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 
1[hwt 0]]: [B/B][./.]
[csclprd3-0-1:14318] MCW rank 25 is not bound (or bound to all available 
processors)
[csclprd3-0-1:14318] MCW rank 20 is not bound (or bound to all available 
processors)
[csclprd3-0-3:13827] MCW rank 34 is not bound (or bound to all available 
processors)
[csclprd3-0-1:14318] MCW rank 21 is not bound (or bound to all available 
processors)
[csclprd3-0-3:13827] MCW rank 35 is not bound (or bound to all available 
processors)
[csclprd3-6-1:24682] MCW rank 1 bound to socket 1[core 2[hwt 0]], socket 1[core 
3[hwt 0]]: [./.][B/B]
[csclprd3-0-3:13827] MCW rank 36 is not bound (or bound to all available 
processors)
[csclprd3-6-1:24682] MCW rank 2 bound to socket 0[core 0[hwt 0]], socket 0[core 
1[hwt 0]]: [B/B][./.]
[csclprd3-0-6:30371] MCW rank 51 is not bound (or bound to all available 
processors)
[csclprd3-0-6:30371] MCW rank 52 is not bound (or bound to all available 
processors)
[csclprd3-0-6:30371] MCW rank 53 is not bound (or bound to all available 
processors)
[csclprd3-0-2:05825] MCW rank 30 is not bound (or bound to all available 
processors)
[csclprd3-0-6:30371] MCW rank 54 is not bound (or bound to all available 
processors)
[csclprd3-0-3:13827] MCW rank 37 is not bound (or bound to all available 
processors)
[csclprd3-0-2:05825] MCW rank 31 is not bound (or bound to all available 
processors)
[csclprd3-0-3:13827] MCW rank 32 is not bound (or bound to all available 
processors)
[csclprd3-0-6:30371] MCW rank 55 is not bound (or bound to all available 
processors)
[csclprd3-0-3:13827] MCW rank 33 is not bound (or bound to all available 
processors)
[csclprd3-0-6:30371] MCW rank 50 is not bound (or bound to all available 
processors)
[csclprd3-0-2:05825] MCW rank 26 is not bound (or bound to all available 
processors)
[csclprd3-0-2:05825] MCW rank 27 is not bound (or bound to all available 
processors)
[csclprd3-0-2:05825] MCW rank 28 is not bound (or bound to all available 
processors)
[csclprd3-0-2:05825] MCW rank 29 is not bound (or bound to all available 
processors)
[csclprd3-0-12:12383] MCW rank 121 is not bound (or bound to all available 
processors)
[csclprd3-0-12:12383] MCW rank 122 is not bound (or bound to all available 
processors)
[csclprd3-0-12:12383] MCW rank 123 is not bound (or bound to all available 
processors)
[csclprd3-0-12:12383] MCW rank 124 is not bound (or bound to all available 
processors)
[csclprd3-0-12:12383] MCW rank 125 is not bound (or bound to all available 
processors)
[csclprd3-0-12:12383] MCW rank 120 is not bound (or bound to all available 
processors)
[csclprd3-0-0:31079] MCW rank 13 bound to socket 1[core 6[hwt 0]], socket 
1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 
1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
[csclprd3-0-0:31079] MCW rank 14 bound to socket 0[core 0[hwt 0]], socket 
0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 
0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
[csclprd3-0-0:31079] MCW rank 15 bound to socket 1[core 6[hwt 0]], socket 
1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 
1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
[csclprd3-0-0:31079] MCW rank 16 bound to socket 0[core 0[hwt 0]], socket 
0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 
0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
[csclprd3-0-7:20515] MCW rank 68 bound to socket 0[core 0[hwt 0-1]], socket 
0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 
0-1]], socket 0[core 7[hwt 0-1]]: 
[BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-10:19096] MCW rank 100 bound to socket 0[core 0[hwt 0-1]], socket 
0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 
0-1]], socket 0[core 7[hwt 0-1]]: 
[BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-7:20515] MCW rank 69 bound to socket 1[core 8[hwt 0-1]], socket 
1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], 
socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 
0-1]], socket 1[core 15[hwt 0-1]]: 
[../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-10:19096] MCW rank 101 bound to socket 1[core 8[hwt 0-1]], socket 
1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], 
socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 
0-1]], socket 1[core 15[hwt 0-1]]: 
[../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-0:31079] MCW rank 17 bound to socket 1[core 6[hwt 0]], socket 
1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 
1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
[csclprd3-0-7:20515] MCW rank 70 bound to socket 0[core 0[hwt 0-1]], socket 
0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 
0-1]], socket 0[core 7[hwt 0-1]]: 
[BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-10:19096] MCW rank 102 bound to socket 0[core 0[hwt 0-1]], socket 
0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 
0-1]], socket 0[core 7[hwt 0-1]]: 
[BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-11:31636] MCW rank 116 bound to socket 0[core 0[hwt 0-1]], socket 
0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 
0-1]], socket 0[core 7[hwt 0-1]]: 
[BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-11:31636] MCW rank 117 bound to socket 1[core 8[hwt 0-1]], socket 
1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], 
socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 
0-1]], socket 1[core 15[hwt 0-1]]: 
[../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-0:31079] MCW rank 18 bound to socket 0[core 0[hwt 0]], socket 
0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 
0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
[csclprd3-0-11:31636] MCW rank 118 bound to socket 0[core 0[hwt 0-1]], socket 
0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 
0-1]], socket 0[core 7[hwt 0-1]]: 
[BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-0:31079] MCW rank 19 bound to socket 1[core 6[hwt 0]], socket 
1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 
1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
[csclprd3-0-7:20515] MCW rank 71 bound to socket 1[core 8[hwt 0-1]], socket 
1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], 
socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 
0-1]], socket 1[core 15[hwt 0-1]]: 
[../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-10:19096] MCW rank 103 bound to socket 1[core 8[hwt 0-1]], socket 
1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], 
socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 
0-1]], socket 1[core 15[hwt 0-1]]: 
[../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-0:31079] MCW rank 8 bound to socket 0[core 0[hwt 0]], socket 0[core 
1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 
4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
[csclprd3-0-0:31079] MCW rank 9 bound to socket 1[core 6[hwt 0]], socket 1[core 
7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 
10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
[csclprd3-0-10:19096] MCW rank 88 bound to socket 0[core 0[hwt 0-1]], socket 
0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 
0-1]], socket 0[core 7[hwt 0-1]]: 
[BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-11:31636] MCW rank 119 bound to socket 1[core 8[hwt 0-1]], socket 
1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], 
socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 
0-1]], socket 1[core 15[hwt 0-1]]: 
[../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-7:20515] MCW rank 56 bound to socket 0[core 0[hwt 0-1]], socket 
0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 
0-1]], socket 0[core 7[hwt 0-1]]: 
[BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-0:31079] MCW rank 10 bound to socket 0[core 0[hwt 0]], socket 
0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 
0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
[csclprd3-0-7:20515] MCW rank 57 bound to socket 1[core 8[hwt 0-1]], socket 
1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], 
socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 
0-1]], socket 1[core 15[hwt 0-1]]: 
[../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-10:19096] MCW rank 89 bound to socket 1[core 8[hwt 0-1]], socket 
1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]], 
socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 
0-1]], socket 1[core 15[hwt 0-1]]: 
[../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
[csclprd3-0-11:31636] MCW rank 104 bound to socket 0[core 0[hwt 0-1]], socket 
0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 
0-1]], socket 0[core 7[hwt 0-1]]: 
[BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
[csclprd3-0-0:31079] MCW rank 11 bound to socket 1[core 6[hwt 0]], socket 
1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 
1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
[csclprd3-0-0:31079] MCW rank 12 bound to socket 0[core 0[hwt 0]], socket 
0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 
0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
[csclprd3-0-4:30348] MCW rank 42 is not bound (or bound to all available 
processors)

_______________________________________________
users mailing list
us...@open-mpi.org<mailto:us...@open-mpi.org>
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/06/27185.php

_______________________________________________
users mailing list
us...@open-mpi.org<mailto:us...@open-mpi.org>
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/06/27204.php

