Bill,

Were you able to get a core file and analyze the stack with gdb?

I suspect the error occurs in mca_btl_sm_add_procs, but this is just my best
guess.
If this is correct, can you check the value of
mca_btl_sm_component.num_smp_procs?
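
If it helps, the check could be scripted roughly as below. This is only a sketch under several assumptions: core dumps are enabled on the node (for example with ulimit -c unlimited before the run), Open MPI was built with debug symbols so gdb can resolve the symbol, and the sm component is loaded in the crashed process. The function names and the binary and core paths are placeholders, not anything from your setup.

```python
import subprocess
import sys


def gdb_batch_command(binary, core):
    """Build a non-interactive gdb invocation for a core file."""
    return [
        "gdb", "--batch",
        "-ex", "bt",  # backtrace of the thread that crashed
        # Resolving this symbol needs debug symbols (assumption).
        "-ex", "print mca_btl_sm_component.num_smp_procs",
        binary, core,
    ]


def inspect_core(binary, core):
    """Run gdb on the core file and return its combined output as text."""
    result = subprocess.run(
        gdb_batch_command(binary, core),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


# Usage: python inspect_core.py /path/to/binary /path/to/core
if __name__ == "__main__" and len(sys.argv) == 3:
    print(inspect_core(sys.argv[1], sys.argv[2]))
```

If the backtrace does not end in mca_btl_sm_add_procs, the printed value is probably not relevant, and the backtrace itself is what I would like to see.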

As a workaround, can you try excluding the sm (shared memory) btl:
mpirun --mca btl ^sm ...

I do not see how I can tackle the root cause without being able to
reproduce the issue :-(

Can you try to reproduce the issue with the smallest possible hostfile, and
then run lstopo on all the nodes?
By the way, you are not mixing 32-bit and 64-bit OSes, are you?
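
Both checks can be collected in one pass with something like the sketch below. It assumes passwordless ssh to every node, a hostfile with one host per line (optionally followed by slots=N or a # comment), and lstopo in the default PATH on each node. The helper names are made up for illustration.

```python
import subprocess
import sys


def hosts_from_hostfile(path):
    """Return the host names from a hostfile, skipping blanks and # comments."""
    hosts = []
    with open(path) as f:
        for line in f:
            line = line.split("#", 1)[0].strip()
            if line:
                hosts.append(line.split()[0])  # drop any "slots=N" suffix
    return hosts


def run_on(host, command):
    """Run a command on a node over ssh and return its output as text."""
    result = subprocess.run(
        ["ssh", host] + command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout.strip()


# Usage: python probe_nodes.py hostfile-noslots
if __name__ == "__main__" and len(sys.argv) == 2:
    for host in hosts_from_hostfile(sys.argv[1]):
        # uname -m reports the kernel architecture, e.g. x86_64 or i686.
        print("==== %s (%s)" % (host, run_on(host, ["uname", "-m"])))
        # Without X forwarding, lstopo normally prints a text topology.
        print(run_on(host, ["lstopo"]))
```

Any node that does not report the same architecture as the others, or whose lstopo output shows no socket or core objects, would be the first place to look.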

Cheers,

Gilles



On Wednesday, June 24, 2015, Lane, William <
william.l...@cshs.org> wrote:

>  Gilles,
>
> All the blades have only dual-core Xeons (without hyperthreading)
> populating both their sockets. All the x3550 nodes have
> hyperthreading-capable Xeons and Sandy Bridge server CPUs. It's possible
> that hyperthreading has been disabled on some of these nodes, though. The
> 3-0-n nodes are all IBM x3550 nodes, while the 3-6-n nodes are all blade
> nodes.
>
> I have run this exact same test code successfully in the past on another
> cluster (~200 nodes of Sunfire X2100
> 2x dual-core Opterons) w/no issues on upwards of 390 slots. I even tested
> it recently on OpenMPI 1.8.5
> on another smaller R&D cluster consisting of 10 Sunfire X2100 nodes (w/2
> dual core Opterons apiece).
> On this particular cluster I've had success running this code on < 132
> slots.
>
> Anyway, here are the results of the following mpirun:
>
> mpirun -np 132 -display-devel-map --prefix /hpc/apps/mpi/openmpi/1.8.6/
> --hostfile hostfile-noslots --mca btl_tcp_if_include eth0 --hetero-nodes
> --bind-to core /hpc/home/lanew/mpi/openmpi/ProcessColors3 >> out.txt 2>&1
>
> --------------------------------------------------------------------------
> WARNING: a request was made to bind a process. While the system
> supports binding the process itself, at least one node does NOT
> support binding memory to the process location.
>
>   Node:  csclprd3-6-1
>
> This usually is due to not having the required NUMA support installed
> on the node. In some Linux distributions, the required support is
> contained in the libnumactl and libnumactl-devel packages.
> This is a warning only; your job will continue, though performance may be
> degraded.
> --------------------------------------------------------------------------
>  Data for JOB [51718,1] offset 0
>
>  Mapper requested: NULL  Last mapper: round_robin  Mapping policy:
> BYSOCKET  Ranking policy: SLOT
>  Binding policy: CORE  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
>      Num new daemons: 0    New daemon starting vpid INVALID
>      Num nodes: 15
>
>  Data for node: csclprd3-6-1         Launch id: -1    State: 0
>      Daemon: [[51718,0],1]    Daemon launched: True
>      Num slots: 4    Slots in use: 4    Oversubscribed: FALSE
>      Num slots allocated: 4    Max slots: 0
>      Username on node: NULL
>      Num procs: 4    Next node_rank: 4
>      Data for proc: [[51718,1],0]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 0
>          State: INITIALIZED    App_context: 0
>          Locale: [B/B][./.]
>          Binding: [B/.][./.]
>      Data for proc: [[51718,1],1]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 1
>          State: INITIALIZED    App_context: 0
>          Locale: [./.][B/B]
>          Binding: [./.][B/.]
>      Data for proc: [[51718,1],2]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 2
>          State: INITIALIZED    App_context: 0
>          Locale: [B/B][./.]
>          Binding: [./B][./.]
>      Data for proc: [[51718,1],3]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 3
>          State: INITIALIZED    App_context: 0
>          Locale: [./.][B/B]
>          Binding: [./.][./B]
>
>  Data for node: csclprd3-6-5         Launch id: -1    State: 0
>      Daemon: [[51718,0],2]    Daemon launched: True
>      Num slots: 4    Slots in use: 4    Oversubscribed: FALSE
>      Num slots allocated: 4    Max slots: 0
>      Username on node: NULL
>      Num procs: 4    Next node_rank: 4
>      Data for proc: [[51718,1],4]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 4
>          State: INITIALIZED    App_context: 0
>          Locale: [B/B][./.]
>          Binding: [B/.][./.]
>      Data for proc: [[51718,1],5]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 5
>          State: INITIALIZED    App_context: 0
>          Locale: [./.][B/B]
>          Binding: [./.][B/.]
>      Data for proc: [[51718,1],6]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 6
>          State: INITIALIZED    App_context: 0
>          Locale: [B/B][./.]
>          Binding: [./B][./.]
>      Data for proc: [[51718,1],7]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 7
>          State: INITIALIZED    App_context: 0
>          Locale: [./.][B/B]
>          Binding: [./.][./B]
>
>  Data for node: csclprd3-0-0         Launch id: -1    State: 0
>      Daemon: [[51718,0],3]    Daemon launched: True
>      Num slots: 12    Slots in use: 12    Oversubscribed: FALSE
>      Num slots allocated: 12    Max slots: 0
>      Username on node: NULL
>      Num procs: 12    Next node_rank: 12
>      Data for proc: [[51718,1],8]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 8
>          State: INITIALIZED    App_context: 0
>          Locale: [B/B/B/B/B/B][./././././.]
>          Binding: [B/././././.][./././././.]
>      Data for proc: [[51718,1],9]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 9
>          State: INITIALIZED    App_context: 0
>          Locale: [./././././.][B/B/B/B/B/B]
>          Binding: [./././././.][B/././././.]
>      Data for proc: [[51718,1],10]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 10
>          State: INITIALIZED    App_context: 0
>          Locale: [B/B/B/B/B/B][./././././.]
>          Binding: [./B/./././.][./././././.]
>      Data for proc: [[51718,1],11]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 11
>          State: INITIALIZED    App_context: 0
>          Locale: [./././././.][B/B/B/B/B/B]
>          Binding: [./././././.][./B/./././.]
>      Data for proc: [[51718,1],12]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 12
>          State: INITIALIZED    App_context: 0
>          Locale: [B/B/B/B/B/B][./././././.]
>          Binding: [././B/././.][./././././.]
>      Data for proc: [[51718,1],13]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 13
>          State: INITIALIZED    App_context: 0
>          Locale: [./././././.][B/B/B/B/B/B]
>          Binding: [./././././.][././B/././.]
>      Data for proc: [[51718,1],14]
>          Pid: 0    Local rank: 6    Node rank: 6    App rank: 14
>          State: INITIALIZED    App_context: 0
>          Locale: [B/B/B/B/B/B][./././././.]
>          Binding: [./././B/./.][./././././.]
>      Data for proc: [[51718,1],15]
>          Pid: 0    Local rank: 7    Node rank: 7    App rank: 15
>          State: INITIALIZED    App_context: 0
>          Locale: [./././././.][B/B/B/B/B/B]
>          Binding: [./././././.][./././B/./.]
>      Data for proc: [[51718,1],16]
>          Pid: 0    Local rank: 8    Node rank: 8    App rank: 16
>          State: INITIALIZED    App_context: 0
>          Locale: [B/B/B/B/B/B][./././././.]
>          Binding: [././././B/.][./././././.]
>      Data for proc: [[51718,1],17]
>          Pid: 0    Local rank: 9    Node rank: 9    App rank: 17
>          State: INITIALIZED    App_context: 0
>          Locale: [./././././.][B/B/B/B/B/B]
>          Binding: [./././././.][././././B/.]
>      Data for proc: [[51718,1],18]
>          Pid: 0    Local rank: 10    Node rank: 10    App rank: 18
>          State: INITIALIZED    App_context: 0
>          Locale: [B/B/B/B/B/B][./././././.]
>          Binding: [./././././B][./././././.]
>      Data for proc: [[51718,1],19]
>          Pid: 0    Local rank: 11    Node rank: 11    App rank: 19
>          State: INITIALIZED    App_context: 0
>          Locale: [./././././.][B/B/B/B/B/B]
>          Binding: [./././././.][./././././B]
>
>  Data for node: csclprd3-0-1         Launch id: -1    State: 0
>      Daemon: [[51718,0],4]    Daemon launched: True
>      Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
>      Num slots allocated: 6    Max slots: 0
>      Username on node: NULL
>      Num procs: 6    Next node_rank: 6
>      Data for proc: [[51718,1],20]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 20
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [B/././././.]
>      Data for proc: [[51718,1],21]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 21
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./B/./././.]
>      Data for proc: [[51718,1],22]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 22
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [././B/././.]
>      Data for proc: [[51718,1],23]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 23
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./././B/./.]
>      Data for proc: [[51718,1],24]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 24
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [././././B/.]
>      Data for proc: [[51718,1],25]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 25
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./././././B]
>
>  Data for node: csclprd3-0-2         Launch id: -1    State: 0
>      Daemon: [[51718,0],5]    Daemon launched: True
>      Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
>      Num slots allocated: 6    Max slots: 0
>      Username on node: NULL
>      Num procs: 6    Next node_rank: 6
>      Data for proc: [[51718,1],26]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 26
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [B/././././.]
>      Data for proc: [[51718,1],27]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 27
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./B/./././.]
>      Data for proc: [[51718,1],28]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 28
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [././B/././.]
>      Data for proc: [[51718,1],29]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 29
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./././B/./.]
>      Data for proc: [[51718,1],30]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 30
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [././././B/.]
>      Data for proc: [[51718,1],31]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 31
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./././././B]
>
>  Data for node: csclprd3-0-3         Launch id: -1    State: 0
>      Daemon: [[51718,0],6]    Daemon launched: True
>      Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
>      Num slots allocated: 6    Max slots: 0
>      Username on node: NULL
>      Num procs: 6    Next node_rank: 6
>      Data for proc: [[51718,1],32]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 32
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [B/././././.]
>      Data for proc: [[51718,1],33]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 33
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./B/./././.]
>      Data for proc: [[51718,1],34]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 34
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [././B/././.]
>      Data for proc: [[51718,1],35]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 35
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./././B/./.]
>      Data for proc: [[51718,1],36]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 36
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [././././B/.]
>      Data for proc: [[51718,1],37]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 37
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./././././B]
>
>  Data for node: csclprd3-0-4         Launch id: -1    State: 0
>      Daemon: [[51718,0],7]    Daemon launched: True
>      Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
>      Num slots allocated: 6    Max slots: 0
>      Username on node: NULL
>      Num procs: 6    Next node_rank: 6
>      Data for proc: [[51718,1],38]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 38
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [B/././././.]
>      Data for proc: [[51718,1],39]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 39
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./B/./././.]
>      Data for proc: [[51718,1],40]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 40
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [././B/././.]
>      Data for proc: [[51718,1],41]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 41
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./././B/./.]
>      Data for proc: [[51718,1],42]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 42
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [././././B/.]
>      Data for proc: [[51718,1],43]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 43
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./././././B]
>
>  Data for node: csclprd3-0-5         Launch id: -1    State: 0
>      Daemon: [[51718,0],8]    Daemon launched: True
>      Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
>      Num slots allocated: 6    Max slots: 0
>      Username on node: NULL
>      Num procs: 6    Next node_rank: 6
>      Data for proc: [[51718,1],44]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 44
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [B/././././.]
>      Data for proc: [[51718,1],45]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 45
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./B/./././.]
>      Data for proc: [[51718,1],46]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 46
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [././B/././.]
>      Data for proc: [[51718,1],47]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 47
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./././B/./.]
>      Data for proc: [[51718,1],48]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 48
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [././././B/.]
>      Data for proc: [[51718,1],49]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 49
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./././././B]
>
>  Data for node: csclprd3-0-6         Launch id: -1    State: 0
>      Daemon: [[51718,0],9]    Daemon launched: True
>      Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
>      Num slots allocated: 6    Max slots: 0
>      Username on node: NULL
>      Num procs: 6    Next node_rank: 6
>      Data for proc: [[51718,1],50]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 50
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [B/././././.]
>      Data for proc: [[51718,1],51]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 51
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./B/./././.]
>      Data for proc: [[51718,1],52]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 52
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [././B/././.]
>      Data for proc: [[51718,1],53]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 53
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./././B/./.]
>      Data for proc: [[51718,1],54]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 54
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [././././B/.]
>      Data for proc: [[51718,1],55]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 55
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [./././././B]
>
>  Data for node: csclprd3-0-7         Launch id: -1    State: 0
>      Daemon: [[51718,0],10]    Daemon launched: True
>      Num slots: 16    Slots in use: 16    Oversubscribed: FALSE
>      Num slots allocated: 16    Max slots: 0
>      Username on node: NULL
>      Num procs: 16    Next node_rank: 16
>      Data for proc: [[51718,1],56]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 56
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [BB/../../../../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],57]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 57
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][BB/../../../../../../..]
>      Data for proc: [[51718,1],58]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 58
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../BB/../../../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],59]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 59
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../BB/../../../../../..]
>      Data for proc: [[51718,1],60]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 60
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../BB/../../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],61]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 61
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../BB/../../../../..]
>      Data for proc: [[51718,1],62]
>          Pid: 0    Local rank: 6    Node rank: 6    App rank: 62
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../BB/../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],63]
>          Pid: 0    Local rank: 7    Node rank: 7    App rank: 63
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../BB/../../../..]
>      Data for proc: [[51718,1],64]
>          Pid: 0    Local rank: 8    Node rank: 8    App rank: 64
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../BB/../../..][../../../../../../../..]
>      Data for proc: [[51718,1],65]
>          Pid: 0    Local rank: 9    Node rank: 9    App rank: 65
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../BB/../../..]
>      Data for proc: [[51718,1],66]
>          Pid: 0    Local rank: 10    Node rank: 10    App rank: 66
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../../BB/../..][../../../../../../../..]
>      Data for proc: [[51718,1],67]
>          Pid: 0    Local rank: 11    Node rank: 11    App rank: 67
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../../BB/../..]
>      Data for proc: [[51718,1],68]
>          Pid: 0    Local rank: 12    Node rank: 12    App rank: 68
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../../../BB/..][../../../../../../../..]
>      Data for proc: [[51718,1],69]
>          Pid: 0    Local rank: 13    Node rank: 13    App rank: 69
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../../../BB/..]
>      Data for proc: [[51718,1],70]
>          Pid: 0    Local rank: 14    Node rank: 14    App rank: 70
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../../../../BB][../../../../../../../..]
>      Data for proc: [[51718,1],71]
>          Pid: 0    Local rank: 15    Node rank: 15    App rank: 71
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../../../../BB]
>
>  Data for node: csclprd3-0-8         Launch id: -1    State: 0
>      Daemon: [[51718,0],11]    Daemon launched: True
>      Num slots: 16    Slots in use: 16    Oversubscribed: FALSE
>      Num slots allocated: 16    Max slots: 0
>      Username on node: NULL
>      Num procs: 16    Next node_rank: 16
>      Data for proc: [[51718,1],72]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 72
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [BB/../../../../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],73]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 73
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][BB/../../../../../../..]
>      Data for proc: [[51718,1],74]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 74
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../BB/../../../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],75]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 75
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../BB/../../../../../..]
>      Data for proc: [[51718,1],76]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 76
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../BB/../../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],77]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 77
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../BB/../../../../..]
>      Data for proc: [[51718,1],78]
>          Pid: 0    Local rank: 6    Node rank: 6    App rank: 78
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../BB/../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],79]
>          Pid: 0    Local rank: 7    Node rank: 7    App rank: 79
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../BB/../../../..]
>      Data for proc: [[51718,1],80]
>          Pid: 0    Local rank: 8    Node rank: 8    App rank: 80
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../BB/../../..][../../../../../../../..]
>      Data for proc: [[51718,1],81]
>          Pid: 0    Local rank: 9    Node rank: 9    App rank: 81
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../BB/../../..]
>      Data for proc: [[51718,1],82]
>          Pid: 0    Local rank: 10    Node rank: 10    App rank: 82
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../../BB/../..][../../../../../../../..]
>      Data for proc: [[51718,1],83]
>          Pid: 0    Local rank: 11    Node rank: 11    App rank: 83
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../../BB/../..]
>      Data for proc: [[51718,1],84]
>          Pid: 0    Local rank: 12    Node rank: 12    App rank: 84
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../../../BB/..][../../../../../../../..]
>      Data for proc: [[51718,1],85]
>          Pid: 0    Local rank: 13    Node rank: 13    App rank: 85
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../../../BB/..]
>      Data for proc: [[51718,1],86]
>          Pid: 0    Local rank: 14    Node rank: 14    App rank: 86
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../../../../BB][../../../../../../../..]
>      Data for proc: [[51718,1],87]
>          Pid: 0    Local rank: 15    Node rank: 15    App rank: 87
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../../../../BB]
>
>  Data for node: csclprd3-0-10         Launch id: -1    State: 0
>      Daemon: [[51718,0],12]    Daemon launched: True
>      Num slots: 16    Slots in use: 16    Oversubscribed: FALSE
>      Num slots allocated: 16    Max slots: 0
>      Username on node: NULL
>      Num procs: 16    Next node_rank: 16
>      Data for proc: [[51718,1],88]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 88
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [BB/../../../../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],89]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 89
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][BB/../../../../../../..]
>      Data for proc: [[51718,1],90]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 90
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../BB/../../../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],91]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 91
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../BB/../../../../../..]
>      Data for proc: [[51718,1],92]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 92
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../BB/../../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],93]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 93
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../BB/../../../../..]
>      Data for proc: [[51718,1],94]
>          Pid: 0    Local rank: 6    Node rank: 6    App rank: 94
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../BB/../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],95]
>          Pid: 0    Local rank: 7    Node rank: 7    App rank: 95
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../BB/../../../..]
>      Data for proc: [[51718,1],96]
>          Pid: 0    Local rank: 8    Node rank: 8    App rank: 96
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../BB/../../..][../../../../../../../..]
>      Data for proc: [[51718,1],97]
>          Pid: 0    Local rank: 9    Node rank: 9    App rank: 97
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../BB/../../..]
>      Data for proc: [[51718,1],98]
>          Pid: 0    Local rank: 10    Node rank: 10    App rank: 98
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../../BB/../..][../../../../../../../..]
>      Data for proc: [[51718,1],99]
>          Pid: 0    Local rank: 11    Node rank: 11    App rank: 99
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../../BB/../..]
>      Data for proc: [[51718,1],100]
>          Pid: 0    Local rank: 12    Node rank: 12    App rank: 100
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../../../BB/..][../../../../../../../..]
>      Data for proc: [[51718,1],101]
>          Pid: 0    Local rank: 13    Node rank: 13    App rank: 101
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../../../BB/..]
>      Data for proc: [[51718,1],102]
>          Pid: 0    Local rank: 14    Node rank: 14    App rank: 102
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../../../../BB][../../../../../../../..]
>      Data for proc: [[51718,1],103]
>          Pid: 0    Local rank: 15    Node rank: 15    App rank: 103
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../../../../BB]
>
>  Data for node: csclprd3-0-11         Launch id: -1    State: 0
>      Daemon: [[51718,0],13]    Daemon launched: True
>      Num slots: 16    Slots in use: 16    Oversubscribed: FALSE
>      Num slots allocated: 16    Max slots: 0
>      Username on node: NULL
>      Num procs: 16    Next node_rank: 16
>      Data for proc: [[51718,1],104]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 104
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [BB/../../../../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],105]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 105
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][BB/../../../../../../..]
>      Data for proc: [[51718,1],106]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 106
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../BB/../../../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],107]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 107
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../BB/../../../../../..]
>      Data for proc: [[51718,1],108]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 108
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../BB/../../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],109]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 109
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../BB/../../../../..]
>      Data for proc: [[51718,1],110]
>          Pid: 0    Local rank: 6    Node rank: 6    App rank: 110
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../BB/../../../..][../../../../../../../..]
>      Data for proc: [[51718,1],111]
>          Pid: 0    Local rank: 7    Node rank: 7    App rank: 111
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../BB/../../../..]
>      Data for proc: [[51718,1],112]
>          Pid: 0    Local rank: 8    Node rank: 8    App rank: 112
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../BB/../../..][../../../../../../../..]
>      Data for proc: [[51718,1],113]
>          Pid: 0    Local rank: 9    Node rank: 9    App rank: 113
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../BB/../../..]
>      Data for proc: [[51718,1],114]
>          Pid: 0    Local rank: 10    Node rank: 10    App rank: 114
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../../BB/../..][../../../../../../../..]
>      Data for proc: [[51718,1],115]
>          Pid: 0    Local rank: 11    Node rank: 11    App rank: 115
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../../BB/../..]
>      Data for proc: [[51718,1],116]
>          Pid: 0    Local rank: 12    Node rank: 12    App rank: 116
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../../../BB/..][../../../../../../../..]
>      Data for proc: [[51718,1],117]
>          Pid: 0    Local rank: 13    Node rank: 13    App rank: 117
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../../../BB/..]
>      Data for proc: [[51718,1],118]
>          Pid: 0    Local rank: 14    Node rank: 14    App rank: 118
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>          Binding: [../../../../../../../BB][../../../../../../../..]
>      Data for proc: [[51718,1],119]
>          Pid: 0    Local rank: 15    Node rank: 15    App rank: 119
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../../../..][../../../../../../../BB]
>
>  Data for node: csclprd3-0-12         Launch id: -1    State: 0
>      Daemon: [[51718,0],14]    Daemon launched: True
>      Num slots: 6    Slots in use: 6    Oversubscribed: FALSE
>      Num slots allocated: 6    Max slots: 0
>      Username on node: NULL
>      Num procs: 6    Next node_rank: 6
>      Data for proc: [[51718,1],120]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 120
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [BB/../../../../..]
>      Data for proc: [[51718,1],121]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 121
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [../BB/../../../..]
>      Data for proc: [[51718,1],122]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 122
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [../../BB/../../..]
>      Data for proc: [[51718,1],123]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 123
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [../../../BB/../..]
>      Data for proc: [[51718,1],124]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 124
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [../../../../BB/..]
>      Data for proc: [[51718,1],125]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 125
>          State: INITIALIZED    App_context: 0
>          Locale: UNKNOWN
>          Binding: [../../../../../BB]
>
>  Data for node: csclprd3-0-13         Launch id: -1    State: 0
>      Daemon: [[51718,0],15]    Daemon launched: True
>      Num slots: 12    Slots in use: 6    Oversubscribed: FALSE
>      Num slots allocated: 12    Max slots: 0
>      Username on node: NULL
>      Num procs: 6    Next node_rank: 6
>      Data for proc: [[51718,1],126]
>          Pid: 0    Local rank: 0    Node rank: 0    App rank: 126
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB][../../../../../..]
>          Binding: [BB/../../../../..][../../../../../..]
>      Data for proc: [[51718,1],127]
>          Pid: 0    Local rank: 1    Node rank: 1    App rank: 127
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../..][BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../..][BB/../../../../..]
>      Data for proc: [[51718,1],128]
>          Pid: 0    Local rank: 2    Node rank: 2    App rank: 128
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB][../../../../../..]
>          Binding: [../BB/../../../..][../../../../../..]
>      Data for proc: [[51718,1],129]
>          Pid: 0    Local rank: 3    Node rank: 3    App rank: 129
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../..][BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../..][../BB/../../../..]
>      Data for proc: [[51718,1],130]
>          Pid: 0    Local rank: 4    Node rank: 4    App rank: 130
>          State: INITIALIZED    App_context: 0
>          Locale: [BB/BB/BB/BB/BB/BB][../../../../../..]
>          Binding: [../../BB/../../..][../../../../../..]
>      Data for proc: [[51718,1],131]
>          Pid: 0    Local rank: 5    Node rank: 5    App rank: 131
>          State: INITIALIZED    App_context: 0
>          Locale: [../../../../../..][BB/BB/BB/BB/BB/BB]
>          Binding: [../../../../../..][../../BB/../../..]
> [csclprd3-0-13:31619] *** Process received signal ***
> [csclprd3-0-13:31619] Signal: Bus error (7)
> [csclprd3-0-13:31619] Signal code: Non-existant physical address (2)
> [csclprd3-0-13:31619] Failing at address: 0x7f1374267a00
> [csclprd3-0-13:31620] *** Process received signal ***
> [csclprd3-0-13:31620] Signal: Bus error (7)
> [csclprd3-0-13:31620] Signal code: Non-existant physical address (2)
> [csclprd3-0-13:31620] Failing at address: 0x7fcc702a7980
> [csclprd3-0-13:31615] *** Process received signal ***
> [csclprd3-0-13:31615] Signal: Bus error (7)
> [csclprd3-0-13:31615] Signal code: Non-existant physical address (2)
> [csclprd3-0-13:31615] Failing at address: 0x7f8128367880
> [csclprd3-0-13:31616] *** Process received signal ***
> [csclprd3-0-13:31616] Signal: Bus error (7)
> [csclprd3-0-13:31616] Signal code: Non-existant physical address (2)
> [csclprd3-0-13:31616] Failing at address: 0x7fe674227a00
> [csclprd3-0-13:31617] *** Process received signal ***
> [csclprd3-0-13:31617] Signal: Bus error (7)
> [csclprd3-0-13:31617] Signal code: Non-existant physical address (2)
> [csclprd3-0-13:31617] Failing at address: 0x7f061c32db80
> [csclprd3-0-13:31618] *** Process received signal ***
> [csclprd3-0-13:31618] Signal: Bus error (7)
> [csclprd3-0-13:31618] Signal code: Non-existant physical address (2)
> [csclprd3-0-13:31618] Failing at address: 0x7fb8402eaa80
> [csclprd3-0-13:31618] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7fb851851500]
> [csclprd3-0-13:31618] [ 1] [csclprd3-0-13:31616] [ 0]
> /lib64/libpthread.so.0(+0xf500)[0x7fe6843a4500]
> [csclprd3-0-13:31616] [ 1] [csclprd3-0-13:31620] [ 0]
> /lib64/libpthread.so.0(+0xf500)[0x7fcc80c54500]
> [csclprd3-0-13:31620] [ 1]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fcc80fc9f61]
> [csclprd3-0-13:31620] [ 2]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fcc80fca047]
> [csclprd3-0-13:31620] [ 3] [csclprd3-0-13:31615] [ 0]
> /lib64/libpthread.so.0(+0xf500)[0x7f81385ca500]
> [csclprd3-0-13:31615] [ 1]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f813893ff61]
> [csclprd3-0-13:31615] [ 2]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f8138940047]
> [csclprd3-0-13:31615] [ 3]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fb851bc6f61]
> [csclprd3-0-13:31618] [ 2]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fb851bc7047]
> [csclprd3-0-13:31618] [ 3]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fb851ab4670]
> [csclprd3-0-13:31618] [ 4] [csclprd3-0-13:31617] [ 0]
> /lib64/libpthread.so.0(+0xf500)[0x7f062cfe5500]
> [csclprd3-0-13:31617] [ 1]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f062d35af61]
> [csclprd3-0-13:31617] [ 2]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f062d35b047]
> [csclprd3-0-13:31617] [ 3]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f062d248670]
> [csclprd3-0-13:31617] [ 4] [csclprd3-0-13:31619] [ 0]
> /lib64/libpthread.so.0(+0xf500)[0x7f1384fde500]
> [csclprd3-0-13:31619] [ 1]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f1385353f61]
> [csclprd3-0-13:31619] [ 2]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fe684719f61]
> [csclprd3-0-13:31616] [ 2]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fe68471a047]
> [csclprd3-0-13:31616] [ 3]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fe684607670]
> [csclprd3-0-13:31616] [ 4]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f1385354047]
> [csclprd3-0-13:31619] [ 3]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f1385241670]
> [csclprd3-0-13:31619] [ 4]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f13852425ab]
> [csclprd3-0-13:31619] [ 5]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f1385242751]
> [csclprd3-0-13:31619] [ 6]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f13853501c9]
> [csclprd3-0-13:31619] [ 7]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f1385336628]
> [csclprd3-0-13:31619] [ 8]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fcc80eb7670]
> [csclprd3-0-13:31620] [ 4]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fcc80eb85ab]
> [csclprd3-0-13:31620] [ 5]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fcc80eb8751]
> [csclprd3-0-13:31620] [ 6]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fcc80fc61c9]
> [csclprd3-0-13:31620] [ 7]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fcc80fac628]
> [csclprd3-0-13:31620] [ 8]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fcc8111fd61]
> [csclprd3-0-13:31620] [ 9]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f813882d670]
> [csclprd3-0-13:31615] [ 4]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f813882e5ab]
> [csclprd3-0-13:31615] [ 5]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f813882e751]
> [csclprd3-0-13:31615] [ 6]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f813893c1c9]
> [csclprd3-0-13:31615] [ 7]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f8138922628]
> [csclprd3-0-13:31615] [ 8]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f8138a95d61]
> [csclprd3-0-13:31615] [ 9]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f813885d747]
> [csclprd3-0-13:31615] [10]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fb851ab55ab]
> [csclprd3-0-13:31618] [ 5]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fb851ab5751]
> [csclprd3-0-13:31618] [ 6]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fb851bc31c9]
> [csclprd3-0-13:31618] [ 7]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fb851ba9628]
> [csclprd3-0-13:31618] [ 8]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fb851d1cd61]
> [csclprd3-0-13:31618] [ 9]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fb851ae4747]
> [csclprd3-0-13:31618] [10]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f062d2495ab]
> [csclprd3-0-13:31617] [ 5]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f062d249751]
> [csclprd3-0-13:31617] [ 6]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f062d3571c9]
> [csclprd3-0-13:31617] [ 7]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f062d33d628]
> [csclprd3-0-13:31617] [ 8]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f062d4b0d61]
> [csclprd3-0-13:31617] [ 9]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f062d278747]
> [csclprd3-0-13:31617] [10]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fe6846085ab]
> [csclprd3-0-13:31616] [ 5]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fe684608751]
> [csclprd3-0-13:31616] [ 6]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fe6847161c9]
> [csclprd3-0-13:31616] [ 7]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fe6846fc628]
> [csclprd3-0-13:31616] [ 8]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fe68486fd61]
> [csclprd3-0-13:31616] [ 9]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fe684637747]
> [csclprd3-0-13:31616] [10]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fe68467750b]
> [csclprd3-0-13:31616] [11]
> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0]
> [csclprd3-0-13:31616] [12]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fe684021cdd]
> [csclprd3-0-13:31616] [13]
> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999]
> [csclprd3-0-13:31616] *** End of error message ***
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f062d2b850b]
> [csclprd3-0-13:31617] [11]
> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0]
> [csclprd3-0-13:31617] [12]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f062cc62cdd]
> [csclprd3-0-13:31617] [13]
> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999]
> [csclprd3-0-13:31617] *** End of error message ***
>
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f13854a9d61]
> [csclprd3-0-13:31619] [ 9]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f1385271747]
> [csclprd3-0-13:31619] [10]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f13852b150b]
> [csclprd3-0-13:31619] [11]
> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0]
> [csclprd3-0-13:31619] [12]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f1384c5bcdd]
> [csclprd3-0-13:31619] [13]
> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999]
> [csclprd3-0-13:31619] *** End of error message ***
>
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fcc80ee7747]
> [csclprd3-0-13:31620] [10]
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fcc80f2750b]
> [csclprd3-0-13:31620] [11]
> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0]
> [csclprd3-0-13:31620] [12]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fcc808d1cdd]
> [csclprd3-0-13:31620] [13]
> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999]
> [csclprd3-0-13:31620] *** End of error message ***
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f813889d50b]
> [csclprd3-0-13:31615] [11]
> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0]
> [csclprd3-0-13:31615] [12]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8138247cdd]
> [csclprd3-0-13:31615] [13]
> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999]
> [csclprd3-0-13:31615] *** End of error message ***
> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fb851b2450b]
> [csclprd3-0-13:31618] [11]
> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0]
> [csclprd3-0-13:31618] [12]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fb8514cecdd]
> [csclprd3-0-13:31618] [13]
> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999]
> [csclprd3-0-13:31618] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 126 with PID 0 on node csclprd3-0-13
> exited on signal 7 (Bus error).
> --------------------------------------------------------------------------
>
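With six processes writing their traces at once, the output above is heavily interleaved. One way to get a quick overview is to pull out only the signal header lines per PID. The following is a minimal sketch, assuming the mpirun output was saved as text; the name `summarize_signals` and the `sample` list are illustrative and not part of Open MPI.

```python
import re
from collections import OrderedDict

# Matches lines such as "[csclprd3-0-13:31619] Signal: Bus error (7)",
# with or without leading "> " quote markers from the mail thread.
LINE_RE = re.compile(
    r"^(?:>\s*)*\[(?P<host>[^:\]]+):(?P<pid>\d+)\]\s+(?P<rest>.*)$"
)

# "Signal code:" is checked before "Signal:" so that each line is
# filed under the right field.
FIELDS = (
    ("Signal code:", "code"),
    ("Signal:", "signal"),
    ("Failing at address:", "address"),
)


def summarize_signals(lines):
    """Return {(host, pid): {"signal": ..., "code": ..., "address": ...}}."""
    summary = OrderedDict()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        key = (match.group("host"), int(match.group("pid")))
        rest = match.group("rest")
        for label, field in FIELDS:
            if rest.startswith(label):
                summary.setdefault(key, {})[field] = rest[len(label):].strip()
                break
    return summary


# A few lines copied from the trace above (hypothetical input file contents).
sample = [
    "> [csclprd3-0-13:31619] *** Process received signal ***",
    "> [csclprd3-0-13:31619] Signal: Bus error (7)",
    "> [csclprd3-0-13:31619] Signal code: Non-existant physical address (2)",
    "> [csclprd3-0-13:31619] Failing at address: 0x7f1374267a00",
    "> [csclprd3-0-13:31620] Signal: Bus error (7)",
    "> [csclprd3-0-13:31620] Failing at address: 0x7fcc702a7980",
]

result = summarize_signals(sample)
for (host, pid), info in result.items():
    print(host, pid, info)
```

Stack-frame lines are deliberately ignored here, because the interleaving makes it ambiguous which frame belongs to which process.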
>  ------------------------------
> *From:* users [users-boun...@open-mpi.org] on behalf of Ralph Castain
> [r...@open-mpi.org]
> *Sent:* Tuesday, June 23, 2015 6:20 PM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots =
> crash
>
>   Wow - that is one sick puppy! I see that some nodes are reporting
> not-bound for their procs, and the rest are binding to socket (as they
> should). Some of your nodes clearly do not have hyper threads enabled (or
> only have single-thread cores on them), and have 2 cores/socket. Other
> nodes have 8 cores/socket with hyper threads enabled, while still others
> have 6 cores/socket and HT enabled.
>
>  I don't see anyone binding to a single HT if multiple HTs/core are
> available. I think you are being fooled by those nodes that either don't
> have HT enabled, or have only 1 HT/core.
>
>  In both cases, it is node 13 that fails. I also note
> that you said everything works okay with < 132 ranks, and node 13 hosts
> ranks 126-131. So node 13 would host ranks even if you reduced the number
> in the job to 131. This would imply that it probably isn't something wrong
> with the node itself.
>
>  Is there any way you could run a job of this size on a homogeneous
> cluster? The procs all show bindings that look right, but I'm wondering if
> the heterogeneity is the source of the trouble here. We may be
> communicating the binding pattern incorrectly and giving bad info to the
> backend daemon.
>
>  Also, rather than --report-bindings, use "--display-devel-map" on the
> command line and let's see what the mapper thinks it did. If there is a
> problem with placement, that is where it would exist.
>
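The mapper output printed by --display-devel-map is long, so a per-node summary can make it easier to see what the mapper thinks it did. The following is a minimal sketch that assumes the map text follows the "Data for node:" / "App rank:" / "Locale:" layout shown in this thread; the function name `summarize_map` and the `sample` list are illustrative only.

```python
import re

NODE_RE = re.compile(r"Data for node:\s*(\S+)")
RANK_RE = re.compile(r"App rank:\s*(\d+)")
LOCALE_RE = re.compile(r"Locale:\s*(.+?)\s*$")


def summarize_map(lines):
    """Return {node: {"ranks": [...], "unknown_locale": count}}."""
    nodes = {}
    current = None
    for line in lines:
        match = NODE_RE.search(line)
        if match:
            # Start (or resume) the record for this node.
            current = nodes.setdefault(
                match.group(1), {"ranks": [], "unknown_locale": 0}
            )
            continue
        if current is None:
            continue
        match = RANK_RE.search(line)
        if match:
            current["ranks"].append(int(match.group(1)))
            continue
        match = LOCALE_RE.search(line)
        if match and match.group(1) == "UNKNOWN":
            current["unknown_locale"] += 1
    return nodes


# Lines copied from the devel map in this thread (quote markers kept).
sample = [
    ">  Data for node: csclprd3-0-12         Launch id: -1    State: 0",
    ">      Data for proc: [[51718,1],120]",
    ">          Pid: 0    Local rank: 0    Node rank: 0    App rank: 120",
    ">          Locale: UNKNOWN",
    ">          Binding: [BB/../../../../..]",
    ">  Data for node: csclprd3-0-13         Launch id: -1    State: 0",
    ">      Data for proc: [[51718,1],126]",
    ">          Pid: 0    Local rank: 0    Node rank: 0    App rank: 126",
    ">          Locale: [BB/BB/BB/BB/BB/BB][../../../../../..]",
    ">          Binding: [BB/../../../../..][../../../../../..]",
]

nodes = summarize_map(sample)
for name, info in nodes.items():
    print(name, info["ranks"], "unknown locale:", info["unknown_locale"])
```

In the devel map posted in this thread, the procs on csclprd3-0-12 (ranks 120-125) report Locale: UNKNOWN, which is the kind of irregularity a summary like this would surface.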
>
> On Tue, Jun 23, 2015 at 5:12 PM, Lane, William <william.l...@cshs.org>
> wrote:
>
>>  Ralph,
>>
>> There is something funny going on: the traces from the
>> runs w/the debug build aren't showing any differences from
>> what I got earlier. However, I did do a run w/the --bind-to core
>> switch and was surprised to see that hyperthreading cores were
>> sometimes being used.
>>
>> Here are the traces that I have:
>>
>> mpirun -np 132 -report-bindings --prefix /hpc/apps/mpi/openmpi/1.8.6/
>> --hostfile hostfile-noslots --mca btl_tcp_if_include eth0 --hetero-nodes
>> /hpc/home/lanew/mpi/openmpi/ProcessColors3
>> [csclprd3-0-5:16802] MCW rank 44 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-5:16802] MCW rank 45 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-5:16802] MCW rank 46 is not bound (or bound to all available
>> processors)
>> [csclprd3-6-5:12480] MCW rank 4 bound to socket 0[core 0[hwt 0]], socket
>> 0[core 1[hwt 0]]: [B/B][./.]
>> [csclprd3-6-5:12480] MCW rank 5 bound to socket 1[core 2[hwt 0]], socket
>> 1[core 3[hwt 0]]: [./.][B/B]
>> [csclprd3-6-5:12480] MCW rank 6 bound to socket 0[core 0[hwt 0]], socket
>> 0[core 1[hwt 0]]: [B/B][./.]
>> [csclprd3-6-5:12480] MCW rank 7 bound to socket 1[core 2[hwt 0]], socket
>> 1[core 3[hwt 0]]: [./.][B/B]
>> [csclprd3-0-5:16802] MCW rank 47 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-5:16802] MCW rank 48 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-5:16802] MCW rank 49 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-1:14318] MCW rank 22 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-1:14318] MCW rank 23 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-1:14318] MCW rank 24 is not bound (or bound to all available
>> processors)
>> [csclprd3-6-1:24682] MCW rank 3 bound to socket 1[core 2[hwt 0]], socket
>> 1[core 3[hwt 0]]: [./.][B/B]
>> [csclprd3-6-1:24682] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket
>> 0[core 1[hwt 0]]: [B/B][./.]
>> [csclprd3-0-1:14318] MCW rank 25 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-1:14318] MCW rank 20 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-3:13827] MCW rank 34 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-1:14318] MCW rank 21 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-3:13827] MCW rank 35 is not bound (or bound to all available
>> processors)
>> [csclprd3-6-1:24682] MCW rank 1 bound to socket 1[core 2[hwt 0]], socket
>> 1[core 3[hwt 0]]: [./.][B/B]
>> [csclprd3-0-3:13827] MCW rank 36 is not bound (or bound to all available
>> processors)
>> [csclprd3-6-1:24682] MCW rank 2 bound to socket 0[core 0[hwt 0]], socket
>> 0[core 1[hwt 0]]: [B/B][./.]
>> [csclprd3-0-6:30371] MCW rank 51 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-6:30371] MCW rank 52 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-6:30371] MCW rank 53 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-2:05825] MCW rank 30 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-6:30371] MCW rank 54 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-3:13827] MCW rank 37 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-2:05825] MCW rank 31 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-3:13827] MCW rank 32 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-6:30371] MCW rank 55 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-3:13827] MCW rank 33 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-6:30371] MCW rank 50 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-2:05825] MCW rank 26 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-2:05825] MCW rank 27 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-2:05825] MCW rank 28 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-2:05825] MCW rank 29 is not bound (or bound to all available
>> processors)
>> [csclprd3-0-12:12383] MCW rank 121 is not bound (or bound to all
>> available processors)
>> [csclprd3-0-12:12383] MCW rank 122 is not bound (or bound to all
>> available processors)
>> [csclprd3-0-12:12383] MCW rank 123 is not bound (or bound to all
>> available processors)
>> [csclprd3-0-12:12383] MCW rank 124 is not bound (or bound to all
>> available processors)
>> [csclprd3-0-12:12383] MCW rank 125 is not bound (or bound to all
>> available processors)
>> [csclprd3-0-12:12383] MCW rank 120 is not bound (or bound to all
>> available processors)
>> [csclprd3-0-0:31079] MCW rank 13 bound to socket 1[core 6[hwt 0]], socket
>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket
>> 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
>> [csclprd3-0-0:31079] MCW rank 14 bound to socket 0[core 0[hwt 0]], socket
>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket
>> 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
>> [csclprd3-0-0:31079] MCW rank 15 bound to socket 1[core 6[hwt 0]], socket
>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket
>> 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
>> [csclprd3-0-0:31079] MCW rank 16 bound to socket 0[core 0[hwt 0]], socket
>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket
>> 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
>> [csclprd3-0-7:20515] MCW rank 68 bound to socket 0[core 0[hwt 0-1]],
>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
>> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]:
>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>> [csclprd3-0-10:19096] MCW rank 100 bound to socket 0[core 0[hwt 0-1]],
>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
>> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]:
>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>> [csclprd3-0-7:20515] MCW rank 69 bound to socket 1[core 8[hwt 0-1]],
>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt
>> 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket
>> 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]:
>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>> [csclprd3-0-10:19096] MCW rank 101 bound to socket 1[core 8[hwt 0-1]],
>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt
>> 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket
>> 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]:
>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>> [csclprd3-0-0:31079] MCW rank 17 bound to socket 1[core 6[hwt 0]], socket
>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket
>> 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
>> [csclprd3-0-7:20515] MCW rank 70 bound to socket 0[core 0[hwt 0-1]],
>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
>> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]:
>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>> [csclprd3-0-10:19096] MCW rank 102 bound to socket 0[core 0[hwt 0-1]],
>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
>> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]:
>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>> [csclprd3-0-11:31636] MCW rank 116 bound to socket 0[core 0[hwt 0-1]],
>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
>> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]:
>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>> [csclprd3-0-11:31636] MCW rank 117 bound to socket 1[core 8[hwt 0-1]],
>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt
>> 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket
>> 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]:
>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>> [csclprd3-0-0:31079] MCW rank 18 bound to socket 0[core 0[hwt 0]], socket
>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket
>> 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
>> [csclprd3-0-11:31636] MCW rank 118 bound to socket 0[core 0[hwt 0-1]],
>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
>> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]:
>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>> [csclprd3-0-0:31079] MCW rank 19 bound to socket 1[core 6[hwt 0]], socket
>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket
>> 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
>> [csclprd3-0-7:20515] MCW rank 71 bound to socket 1[core 8[hwt 0-1]],
>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt
>> 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket
>> 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]:
>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>> [csclprd3-0-10:19096] MCW rank 103 bound to socket 1[core 8[hwt 0-1]],
>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt
>> 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket
>> 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]:
>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>> [csclprd3-0-0:31079] MCW rank 8 bound to socket 0[core 0[hwt 0]], socket
>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket
>> 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
>> [csclprd3-0-0:31079] MCW rank 9 bound to socket 1[core 6[hwt 0]], socket
>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket
>> 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
>> [csclprd3-0-10:19096] MCW rank 88 bound to socket 0[core 0[hwt 0-1]],
>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
>> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]:
>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>> [csclprd3-0-11:31636] MCW rank 119 bound to socket 1[core 8[hwt 0-1]],
>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt
>> 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket
>> 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]:
>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>> [csclprd3-0-7:20515] MCW rank 56 bound to socket 0[core 0[hwt 0-1]],
>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
>> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]:
>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>> [csclprd3-0-0:31079] MCW rank 10 bound to socket 0[core 0[hwt 0]], socket
>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket
>> 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
>> [csclprd3-0-7:20515] MCW rank 57 bound to socket 1[core 8[hwt 0-1]],
>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt
>> 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket
>> 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]:
>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>> [csclprd3-0-10:19096] MCW rank 89 bound to socket 1[core 8[hwt 0-1]],
>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt
>> 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], socket
>> 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]:
>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
>> [csclprd3-0-11:31636] MCW rank 104 bound to socket 0[core 0[hwt 0-1]],
>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
>> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]:
>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
>> [csclprd3-0-0:31079] MCW rank 11 bound to socket 1[core 6[hwt 0]], socket
>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket
>> 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][B/B/B/B/B/B]
>> [csclprd3-0-0:31079] MCW rank 12 bound to socket 0[core 0[hwt 0]], socket
>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket
>> 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
>> [csclprd3-0-4:30348] MCW rank 42 is not bound (or bound to all
>>
>
