Here is what I see in my 1.8.5 build lib directory:

lrwxrwxrwx. 1 rhc       15 Apr 28 07:51 libmpi.so -> libmpi.so.1.6.0*
lrwxrwxrwx. 1 rhc       15 Apr 28 07:51 libmpi.so.1 -> libmpi.so.1.6.0*
-rwxr-xr-x. 1 rhc  1015923 Apr 28 07:51 libmpi.so.1.6.0*

So libmpi.so.1 should just be a symlink to the versioned library.
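
A quick way to confirm what's happening on the node where xhpl fails (a minimal sketch, not something run in this thread; the paths are the ones quoted below and may differ on other nodes):

    # list the libmpi links the runtime linker can see in the 1.8.5 tree
    ls -l /apps/mpi/openmpi/1.8.5-dev/lib/libmpi.so*

    # check which soname the benchmark binary actually requests
    ldd /hpc/apps/benchmarks/hpl/xhpl | grep libmpi

If the lib directory contains the versioned library but no libmpi.so.1 link, the install is simply missing the symlink; if it only provides libmpi.so.0, then xhpl was most likely built against a different Open MPI install and should either be rebuilt or pointed (via LD_LIBRARY_PATH) at the install it was compiled against.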

> On Apr 28, 2015, at 10:30 AM, Lane, William <william.l...@cshs.org> wrote:
> 
> Ralph,
> 
> I copied the LAPACK benchmark binaries (xhpl being the binary) over to a
> development system (which is running the same version of CentOS), but I'm
> getting errors trying to run the LAPACK benchmark against OpenMPI 1.8.5:
> 
> xhpl: error while loading shared libraries: libmpi.so.1: cannot open shared 
> object file: No such file or directory
> xhpl: error while loading shared libraries: libmpi.so.1: cannot open shared 
> object file: No such file or directory
> xhpl: error while loading shared libraries: libmpi.so.1: cannot open shared 
> object file: No such file or directory
> xhpl: error while loading shared libraries: libmpi.so.1: cannot open shared 
> object file: No such file or directory
> 
> When I look at the 1.8.5 install directory, I find the following shared object
> libraries but no libmpi.so.1:
> 
> /apps/mpi/openmpi/1.8.5-dev/lib/libmpi.so
> /apps/mpi/openmpi/1.8.5-dev/lib/libmpi.so.0 
> 
> Is it necessary to re-compile the LAPACK benchmark to run it against OpenMPI
> 1.8.5 as opposed to 1.8.2?
> 
> -Bill L.
> 
> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
> [r...@open-mpi.org]
> Sent: Friday, April 10, 2015 5:28 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
> 
> This will be in the next nightly 1.8.5 tarball.
> 
> Bill: can you test it to see if we’ve fixed the problem?
> 
> Thanks
> Ralph
> 
> 
>> On Apr 10, 2015, at 2:15 PM, Ralph Castain <r...@open-mpi.org 
>> <mailto:r...@open-mpi.org>> wrote:
>> 
>> Okay, I at least now understand the behavior from this particular cmd line. 
>> Looks like we are binding-to-core by default, even if you specify 
>> use-hwthread-cpus. I’ll fix that one - still don’t understand the segfaults.
>> 
>> Bill: can you shed some light on those?
>> 
>> 
>>> On Apr 9, 2015, at 8:28 PM, Lane, William <william.l...@cshs.org 
>>> <mailto:william.l...@cshs.org>> wrote:
>>> 
>>> Ralph,
>>> 
>>> In looking at the /proc/cpuinfo text file it looks like hyperthreading
>>> is enabled (in that it indicates 16 siblings for each of the 8 cores of the
>>> two LGA2011 CPU's). I don't have access to the BIOS on this system though,
>>> so I'll have to check w/someone else.
>>> 
>>> I have done more testing and found that at 104 slots requested, OpenMPI
>>> won't run the LAPACK benchmark. All the LGA2011 nodes exhibit the same
>>> strange binding behavior (maybe because hyperthreading is turned on for
>>> these nodes, but not the LGA 1366 nodes?). Below is all the relevant
>>> information for that run:
>>> 
>>> II.
>>> a. $MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile 
>>> hostfile-single --mca btl_tcp_if_include eth0 --hetero-nodes 
>>> --use-hwthread-cpus --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN
>>> 
>>> where NSLOTS=104
>>> 
>>> b.
>>> [lanew@csclprd3s1 hpl]$ . /hpc/apps/benchmarks/runhpl4.job
>>> [csclprd3-6-1:27586] MCW rank 3 bound to socket 1[core 3[hwt 0]]: [./.][./B]
>>> [csclprd3-6-1:27586] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/.][./.]
>>> [csclprd3-6-1:27586] MCW rank 1 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
>>> [csclprd3-6-1:27586] MCW rank 2 bound to socket 0[core 1[hwt 0]]: [./B][./.]
>>> [csclprd3-0-2:04454] MCW rank 27 bound to socket 0[core 1[hwt 0]]: 
>>> [./B/./././.]
>>> [csclprd3-0-2:04454] MCW rank 28 bound to socket 0[core 2[hwt 0]]: 
>>> [././B/././.]
>>> [csclprd3-0-2:04454] MCW rank 29 bound to socket 0[core 3[hwt 0]]: 
>>> [./././B/./.]
>>> [csclprd3-0-2:04454] MCW rank 30 bound to socket 0[core 4[hwt 0]]: 
>>> [././././B/.]
>>> [csclprd3-0-2:04454] MCW rank 31 bound to socket 0[core 5[hwt 0]]: 
>>> [./././././B]
>>> [csclprd3-0-2:04454] MCW rank 26 bound to socket 0[core 0[hwt 0]]: 
>>> [B/././././.]
>>> [csclprd3-0-0:21129] MCW rank 8 bound to socket 0[core 0[hwt 0]]: 
>>> [B/././././.][./././././.]
>>> [csclprd3-0-0:21129] MCW rank 9 bound to socket 1[core 6[hwt 0]]: 
>>> [./././././.][B/././././.]
>>> [csclprd3-0-0:21129] MCW rank 10 bound to socket 0[core 1[hwt 0]]: 
>>> [./B/./././.][./././././.]
>>> [csclprd3-0-0:21129] MCW rank 11 bound to socket 1[core 7[hwt 0]]: 
>>> [./././././.][./B/./././.]
>>> [csclprd3-0-0:21129] MCW rank 12 bound to socket 0[core 2[hwt 0]]: 
>>> [././B/././.][./././././.]
>>> [csclprd3-0-0:21129] MCW rank 13 bound to socket 1[core 8[hwt 0]]: 
>>> [./././././.][././B/././.]
>>> [csclprd3-0-0:21129] MCW rank 14 bound to socket 0[core 3[hwt 0]]: 
>>> [./././B/./.][./././././.]
>>> [csclprd3-0-0:21129] MCW rank 15 bound to socket 1[core 9[hwt 0]]: 
>>> [./././././.][./././B/./.]
>>> [csclprd3-0-0:21129] MCW rank 16 bound to socket 0[core 4[hwt 0]]: 
>>> [././././B/.][./././././.]
>>> [csclprd3-0-0:21129] MCW rank 17 bound to socket 1[core 10[hwt 0]]: 
>>> [./././././.][././././B/.]
>>> [csclprd3-0-0:21129] MCW rank 18 bound to socket 0[core 5[hwt 0]]: 
>>> [./././././B][./././././.]
>>> [csclprd3-0-0:21129] MCW rank 19 bound to socket 1[core 11[hwt 0]]: 
>>> [./././././.][./././././B]
>>> [csclprd3-0-1:12882] MCW rank 22 bound to socket 0[core 2[hwt 0]]: 
>>> [././B/././.]
>>> [csclprd3-0-1:12882] MCW rank 23 bound to socket 0[core 3[hwt 0]]: 
>>> [./././B/./.]
>>> [csclprd3-0-1:12882] MCW rank 24 bound to socket 0[core 4[hwt 0]]: 
>>> [././././B/.]
>>> [csclprd3-0-1:12882] MCW rank 25 bound to socket 0[core 5[hwt 0]]: 
>>> [./././././B]
>>> [csclprd3-0-1:12882] MCW rank 20 bound to socket 0[core 0[hwt 0]]: 
>>> [B/././././.]
>>> [csclprd3-0-1:12882] MCW rank 21 bound to socket 0[core 1[hwt 0]]: 
>>> [./B/./././.]
>>> [csclprd3-0-9:27268] MCW rank 92 bound to socket 0[core 2[hwt 0-1]]: 
>>> [../../BB/../../../../..][../../../../../../../..]
>>> [csclprd3-0-9:27268] MCW rank 93 bound to socket 1[core 10[hwt 0-1]]: 
>>> [../../../../../../../..][../../BB/../../../../..]
>>> [csclprd3-0-9:27268] MCW rank 94 bound to socket 0[core 3[hwt 0-1]]: 
>>> [../../../BB/../../../..][../../../../../../../..]
>>> [csclprd3-0-9:27268] MCW rank 95 bound to socket 1[core 11[hwt 0-1]]: 
>>> [../../../../../../../..][../../../BB/../../../..]
>>> [csclprd3-0-9:27268] MCW rank 96 bound to socket 0[core 4[hwt 0-1]]: 
>>> [../../../../BB/../../..][../../../../../../../..]
>>> [csclprd3-0-9:27268] MCW rank 97 bound to socket 1[core 12[hwt 0-1]]: 
>>> [../../../../../../../..][../../../../BB/../../..]
>>> [csclprd3-0-8:22880] MCW rank 84 bound to socket 0[core 6[hwt 0-1]]: 
>>> [../../../../../../BB/..][../../../../../../../..]
>>> [csclprd3-0-8:22880] MCW rank 85 bound to socket 1[core 14[hwt 0-1]]: 
>>> [../../../../../../../..][../../../../../../BB/..]
>>> [csclprd3-0-8:22880] MCW rank 86 bound to socket 0[core 7[hwt 0-1]]: 
>>> [../../../../../../../BB][../../../../../../../..]
>>> [csclprd3-0-8:22880] MCW rank 87 bound to socket 1[core 15[hwt 0-1]]: 
>>> [../../../../../../../..][../../../../../../../BB]
>>> [csclprd3-0-8:22880] MCW rank 72 bound to socket 0[core 0[hwt 0-1]]: 
>>> [BB/../../../../../../..][../../../../../../../..]
>>> [csclprd3-0-8:22880] MCW rank 73 bound to socket 1[core 8[hwt 0-1]]: 
>>> [../../../../../../../..][BB/../../../../../../..]
>>> [csclprd3-0-8:22880] MCW rank 74 bound to socket 0[core 1[hwt 0-1]]: 
>>> [../BB/../../../../../..][../../../../../../../..]
>>> [csclprd3-0-8:22880] MCW rank 75 bound to socket 1[core 9[hwt 0-1]]: 
>>> [../../../../../../../..][../BB/../../../../../..]
>>> [csclprd3-0-8:22880] MCW rank 76 bound to socket 0[core 2[hwt 0-1]]: 
>>> [../../BB/../../../../..][../../../../../../../..]
>>> [csclprd3-0-8:22880] MCW rank 77 bound to socket 1[core 10[hwt 0-1]]: 
>>> [../../../../../../../..][../../BB/../../../../..]
>>> [csclprd3-0-8:22880] MCW rank 78 bound to socket 0[core 3[hwt 0-1]]: 
>>> [../../../BB/../../../..][../../../../../../../..]
>>> [csclprd3-6-5:18000] MCW rank 7 bound to socket 1[core 3[hwt 0]]: [./.][./B]
>>> [csclprd3-6-5:18000] MCW rank 4 bound to socket 0[core 0[hwt 0]]: [B/.][./.]
>>> [csclprd3-6-5:18000] MCW rank 5 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
>>> [csclprd3-6-5:18000] MCW rank 6 bound to socket 0[core 1[hwt 0]]: [./B][./.]
>>> [csclprd3-0-7:08058] MCW rank 60 bound to socket 0[core 2[hwt 0-1]]: 
>>> [../../BB/../../../../..][../../../../../../../..]
>>> [csclprd3-0-7:08058] MCW rank 61 bound to socket 1[core 10[hwt 0-1]]: 
>>> [../../../../../../../..][../../BB/../../../../..]
>>> [csclprd3-0-7:08058] MCW rank 62 bound to socket 0[core 3[hwt 0-1]]: 
>>> [../../../BB/../../../..][../../../../../../../..]
>>> [csclprd3-0-7:08058] MCW rank 63 bound to socket 1[core 11[hwt 0-1]]: 
>>> [../../../../../../../..][../../../BB/../../../..]
>>> [csclprd3-0-7:08058] MCW rank 64 bound to socket 0[core 4[hwt 0-1]]: 
>>> [../../../../BB/../../..][../../../../../../../..]
>>> [csclprd3-0-7:08058] MCW rank 65 bound to socket 1[core 12[hwt 0-1]]: 
>>> [../../../../../../../..][../../../../BB/../../..]
>>> [csclprd3-0-7:08058] MCW rank 66 bound to socket 0[core 5[hwt 0-1]]: 
>>> [../../../../../BB/../..][../../../../../../../..]
>>> [csclprd3-0-7:08058] MCW rank 67 bound to socket 1[core 13[hwt 0-1]]: 
>>> [../../../../../../../..][../../../../../BB/../..]
>>> [csclprd3-0-7:08058] MCW rank 68 bound to socket 0[core 6[hwt 0-1]]: 
>>> [../../../../../../BB/..][../../../../../../../..]
>>> [csclprd3-0-7:08058] MCW rank 69 bound to socket 1[core 14[hwt 0-1]]: 
>>> [../../../../../../../..][../../../../../../BB/..]
>>> [csclprd3-0-7:08058] MCW rank 70 bound to socket 0[core 7[hwt 0-1]]: 
>>> [../../../../../../../BB][../../../../../../../..]
>>> [csclprd3-0-7:08058] MCW rank 71 bound to socket 1[core 15[hwt 0-1]]: 
>>> [../../../../../../../..][../../../../../../../BB]
>>> [csclprd3-0-7:08058] MCW rank 56 bound to socket 0[core 0[hwt 0-1]]: 
>>> [BB/../../../../../../..][../../../../../../../..]
>>> [csclprd3-0-7:08058] MCW rank 57 bound to socket 1[core 8[hwt 0-1]]: 
>>> [../../../../../../../..][BB/../../../../../../..]
>>> [csclprd3-0-7:08058] MCW rank 58 bound to socket 0[core 1[hwt 0-1]]: 
>>> [../BB/../../../../../..][../../../../../../../..]
>>> [csclprd3-0-7:08058] MCW rank 59 bound to socket 1[core 9[hwt 0-1]]: 
>>> [../../../../../../../..][../BB/../../../../../..]
>>> [csclprd3-0-5:15446] MCW rank 44 bound to socket 0[core 0[hwt 0]]: 
>>> [B/././././.]
>>> [csclprd3-0-5:15446] MCW rank 45 bound to socket 0[core 1[hwt 0]]: 
>>> [./B/./././.]
>>> [csclprd3-0-5:15446] MCW rank 46 bound to socket 0[core 2[hwt 0]]: 
>>> [././B/././.]
>>> [csclprd3-0-5:15446] MCW rank 47 bound to socket 0[core 3[hwt 0]]: 
>>> [./././B/./.]
>>> [csclprd3-0-5:15446] MCW rank 48 bound to socket 0[core 4[hwt 0]]: 
>>> [././././B/.]
>>> [csclprd3-0-5:15446] MCW rank 49 bound to socket 0[core 5[hwt 0]]: 
>>> [./././././B]
>>> [csclprd3-0-9:27268] MCW rank 98 bound to socket 0[core 5[hwt 0-1]]: 
>>> [../../../../../BB/../..][../../../../../../../..]
>>> [csclprd3-0-9:27268] MCW rank 99 bound to socket 1[core 13[hwt 0-1]]: 
>>> [../../../../../../../..][../../../../../BB/../..]
>>> [csclprd3-0-9:27268] MCW rank 100 bound to socket 0[core 6[hwt 0-1]]: 
>>> [../../../../../../BB/..][../../../../../../../..]
>>> [csclprd3-0-9:27268] MCW rank 101 bound to socket 1[core 14[hwt 0-1]]: 
>>> [../../../../../../../..][../../../../../../BB/..]
>>> [csclprd3-0-9:27268] MCW rank 102 bound to socket 0[core 7[hwt 0-1]]: 
>>> [../../../../../../../BB][../../../../../../../..]
>>> [csclprd3-0-9:27268] MCW rank 103 bound to socket 1[core 15[hwt 0-1]]: 
>>> [../../../../../../../..][../../../../../../../BB]
>>> [csclprd3-0-9:27268] MCW rank 88 bound to socket 0[core 0[hwt 0-1]]: 
>>> [BB/../../../../../../..][../../../../../../../..]
>>> [csclprd3-0-9:27268] MCW rank 89 bound to socket 1[core 8[hwt 0-1]]: 
>>> [../../../../../../../..][BB/../../../../../../..]
>>> [csclprd3-0-9:27268] MCW rank 90 bound to socket 0[core 1[hwt 0-1]]: 
>>> [../BB/../../../../../..][../../../../../../../..]
>>> [csclprd3-0-9:27268] MCW rank 91 bound to socket 1[core 9[hwt 0-1]]: 
>>> [../../../../../../../..][../BB/../../../../../..]
>>> [csclprd3-0-6:28854] MCW rank 51 bound to socket 0[core 1[hwt 0]]: 
>>> [./B/./././.]
>>> [csclprd3-0-6:28854] MCW rank 52 bound to socket 0[core 2[hwt 0]]: 
>>> [././B/././.]
>>> [csclprd3-0-6:28854] MCW rank 53 bound to socket 0[core 3[hwt 0]]: 
>>> [./././B/./.]
>>> [csclprd3-0-6:28854] MCW rank 54 bound to socket 0[core 4[hwt 0]]: 
>>> [././././B/.]
>>> [csclprd3-0-6:28854] MCW rank 55 bound to socket 0[core 5[hwt 0]]: 
>>> [./././././B]
>>> [csclprd3-0-6:28854] MCW rank 50 bound to socket 0[core 0[hwt 0]]: 
>>> [B/././././.]
>>> [csclprd3-0-4:28072] MCW rank 38 bound to socket 0[core 0[hwt 0]]: 
>>> [B/././././.]
>>> [csclprd3-0-4:28072] MCW rank 39 bound to socket 0[core 1[hwt 0]]: 
>>> [./B/./././.]
>>> [csclprd3-0-4:28072] MCW rank 40 bound to socket 0[core 2[hwt 0]]: 
>>> [././B/././.]
>>> [csclprd3-0-4:28072] MCW rank 41 bound to socket 0[core 3[hwt 0]]: 
>>> [./././B/./.]
>>> [csclprd3-0-4:28072] MCW rank 42 bound to socket 0[core 4[hwt 0]]: 
>>> [././././B/.]
>>> [csclprd3-0-4:28072] MCW rank 43 bound to socket 0[core 5[hwt 0]]: 
>>> [./././././B]
>>> [csclprd3-0-8:22880] MCW rank 79 bound to socket 1[core 11[hwt 0-1]]: 
>>> [../../../../../../../..][../../../BB/../../../..]
>>> [csclprd3-0-8:22880] MCW rank 80 bound to socket 0[core 4[hwt 0-1]]: 
>>> [../../../../BB/../../..][../../../../../../../..]
>>> [csclprd3-0-8:22880] MCW rank 81 bound to socket 1[core 12[hwt 0-1]]: 
>>> [../../../../../../../..][../../../../BB/../../..]
>>> [csclprd3-0-8:22880] MCW rank 82 bound to socket 0[core 5[hwt 0-1]]: 
>>> [../../../../../BB/../..][../../../../../../../..]
>>> [csclprd3-0-8:22880] MCW rank 83 bound to socket 1[core 13[hwt 0-1]]: 
>>> [../../../../../../../..][../../../../../BB/../..]
>>> [csclprd3-0-3:11837] MCW rank 33 bound to socket 0[core 1[hwt 0]]: 
>>> [./B/./././.]
>>> [csclprd3-0-3:11837] MCW rank 34 bound to socket 0[core 2[hwt 0]]: 
>>> [././B/././.]
>>> [csclprd3-0-3:11837] MCW rank 35 bound to socket 0[core 3[hwt 0]]: 
>>> [./././B/./.]
>>> [csclprd3-0-3:11837] MCW rank 36 bound to socket 0[core 4[hwt 0]]: 
>>> [././././B/.]
>>> [csclprd3-0-3:11837] MCW rank 37 bound to socket 0[core 5[hwt 0]]: 
>>> [./././././B]
>>> [csclprd3-0-3:11837] MCW rank 32 bound to socket 0[core 0[hwt 0]]: 
>>> [B/././././.]
>>> [csclprd3-0-9:27275] *** Process received signal ***
>>> [csclprd3-0-9:27275] Signal: Bus error (7)
>>> [csclprd3-0-9:27275] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27275] Failing at address: 0x7f1215253000
>>> [csclprd3-0-9:27276] *** Process received signal ***
>>> [csclprd3-0-9:27276] Signal: Bus error (7)
>>> [csclprd3-0-9:27276] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27276] Failing at address: 0x7f563e874380
>>> [csclprd3-0-9:27277] *** Process received signal ***
>>> [csclprd3-0-9:27277] Signal: Bus error (7)
>>> [csclprd3-0-9:27277] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27277] Failing at address: 0x7fbbec79a300
>>> [csclprd3-0-9:27278] *** Process received signal ***
>>> [csclprd3-0-9:27278] Signal: Bus error (7)
>>> [csclprd3-0-9:27278] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27278] Failing at address: 0x7fbadf816080
>>> [csclprd3-0-9:27279] *** Process received signal ***
>>> [csclprd3-0-9:27279] Signal: Bus error (7)
>>> [csclprd3-0-9:27279] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27279] Failing at address: 0x7fab5dfa0100
>>> [csclprd3-0-9:27280] *** Process received signal ***
>>> [csclprd3-0-9:27280] Signal: Bus error (7)
>>> [csclprd3-0-9:27280] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27280] Failing at address: 0x7f0bb4034500
>>> [csclprd3-0-9:27281] *** Process received signal ***
>>> [csclprd3-0-9:27281] Signal: Bus error (7)
>>> [csclprd3-0-9:27281] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27281] Failing at address: 0x7f49bb544f80
>>> [csclprd3-0-9:27282] *** Process received signal ***
>>> [csclprd3-0-9:27282] Signal: Bus error (7)
>>> [csclprd3-0-9:27282] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27282] Failing at address: 0x7fe647f61f00
>>> [csclprd3-0-9:27283] *** Process received signal ***
>>> [csclprd3-0-9:27283] Signal: Bus error (7)
>>> [csclprd3-0-9:27283] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27283] Failing at address: 0x7f79a9d25580
>>> [csclprd3-0-9:27272] *** Process received signal ***
>>> [csclprd3-0-9:27272] Signal: Bus error (7)
>>> [csclprd3-0-9:27272] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27272] Failing at address: 0x7f64568adf80
>>> [csclprd3-0-9:27269] *** Process received signal ***
>>> [csclprd3-0-9:27269] Signal: Bus error (7)
>>> [csclprd3-0-9:27269] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27269] Failing at address: 0x7f5e2a17e580
>>> [csclprd3-0-9:27270] *** Process received signal ***
>>> [csclprd3-0-9:27270] Signal: Bus error (7)
>>> [csclprd3-0-9:27270] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27270] Failing at address: 0x7fda95421400
>>> [csclprd3-0-9:27271] *** Process received signal ***
>>> [csclprd3-0-9:27271] Signal: Bus error (7)
>>> [csclprd3-0-9:27271] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27271] Failing at address: 0x7f873e76c100
>>> [csclprd3-0-9:27271] [ 0] [csclprd3-0-9:27273] *** Process received signal 
>>> ***
>>> [csclprd3-0-9:27273] Signal: Bus error (7)
>>> [csclprd3-0-9:27273] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27273] Failing at address: 0x7f5dc6e99e80
>>> [csclprd3-0-9:27274] *** Process received signal ***
>>> [csclprd3-0-9:27274] Signal: Bus error (7)
>>> [csclprd3-0-9:27274] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27274] Failing at address: 0x7f83afce2280
>>> [csclprd3-0-9:27274] [ 0] [csclprd3-0-9:27269] [ 0] 
>>> /lib64/libc.so.6(+0x32920)[0x7f5e39b82920]
>>> [csclprd3-0-9:27269] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7f5e2f5f111a]
>>> [csclprd3-0-9:27269] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7f5e3a6960c9]
>>> [csclprd3-0-9:27269] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7f5e3a696200]
>>> [csclprd3-0-9:27269] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7f5e2f9f838e]
>>> [csclprd3-0-9:27269] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7f5e2eb204e5]
>>> [csclprd3-0-9:27269] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7f5e3a6b0e26]
>>> [csclprd3-0-9:27269] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7f5e3a6cfe10]
>>> [csclprd3-0-9:27269] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27269] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f5e39b6ecdd]
>>> [csclprd3-0-9:27269] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27269] *** End of error message ***
>>> [csclprd3-0-9:27270] [ 0] /lib64/libc.so.6(+0x32920)[0x7fdaa8d89920]
>>> [csclprd3-0-9:27270] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7fdaa29d711a]
>>> [csclprd3-0-9:27270] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7fdaa989d0c9]
>>> [csclprd3-0-9:27270] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7fdaa989d200]
>>> [csclprd3-0-9:27270] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7fdaa2dde38e]
>>> [csclprd3-0-9:27270] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7fdaa1f064e5]
>>> [csclprd3-0-9:27270] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7fdaa98b7e26]
>>> [csclprd3-0-9:27270] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7fdaa98d6e10]
>>> [csclprd3-0-9:27270] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27270] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fdaa8d75cdd]
>>> [csclprd3-0-9:27270] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27270] *** End of error message ***
>>> /lib64/libc.so.6(+0x32920)[0x7f875211a920]
>>> [csclprd3-0-9:27271] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7f8747dfe11a]
>>> [csclprd3-0-9:27271] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7f8752c2e0c9]
>>> [csclprd3-0-9:27271] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7f8752c2e200]
>>> [csclprd3-0-9:27271] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7f874c20538e]
>>> [csclprd3-0-9:27271] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7f874732d4e5]
>>> [csclprd3-0-9:27271] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7f8752c48e26]
>>> [csclprd3-0-9:27271] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7f8752c67e10]
>>> [csclprd3-0-9:27271] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27271] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8752106cdd]
>>> [csclprd3-0-9:27271] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27271] *** End of error message ***
>>> [csclprd3-0-9:27273] [ 0] /lib64/libc.so.6(+0x32920)[0x7f5ddaa3c920]
>>> [csclprd3-0-9:27273] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7f5dd47ae11a]
>>> [csclprd3-0-9:27273] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7f5ddb5500c9]
>>> [csclprd3-0-9:27273] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7f5ddb550200]
>>> [csclprd3-0-9:27273] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7f5dd4bb538e]
>>> [csclprd3-0-9:27273] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7f5dcfbe54e5]
>>> [csclprd3-0-9:27273] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7f5ddb56ae26]
>>> [csclprd3-0-9:27273] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7f5ddb589e10]
>>> [csclprd3-0-9:27273] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27273] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f5ddaa28cdd]
>>> [csclprd3-0-9:27273] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27273] *** End of error message ***
>>> [csclprd3-0-9:27275] [ 0] /lib64/libc.so.6(+0x32920)[0x7f1228ede920]
>>> [csclprd3-0-9:27275] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7f1222bac11a]
>>> [csclprd3-0-9:27275] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7f12299f20c9]
>>> [csclprd3-0-9:27275] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7f12299f2200]
>>> [csclprd3-0-9:27275] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7f1222fb338e]
>>> [csclprd3-0-9:27275] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7f12220db4e5]
>>> [csclprd3-0-9:27275] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7f1229a0ce26]
>>> [csclprd3-0-9:27275] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7f1229a2be10]
>>> [csclprd3-0-9:27275] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27275] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f1228ecacdd]
>>> [csclprd3-0-9:27275] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27275] *** End of error message ***
>>> [csclprd3-0-9:27276] [ 0] /lib64/libc.so.6(+0x32920)[0x7f565218a920]
>>> [csclprd3-0-9:27276] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7f5647dfe11a]
>>> [csclprd3-0-9:27276] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7f5652c9e0c9]
>>> [csclprd3-0-9:27276] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7f5652c9e200]
>>> [csclprd3-0-9:27276] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7f564c2f738e]
>>> [csclprd3-0-9:27276] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7f564732d4e5]
>>> [csclprd3-0-9:27276] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7f5652cb8e26]
>>> [csclprd3-0-9:27276] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7f5652cd7e10]
>>> [csclprd3-0-9:27276] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27276] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f5652176cdd]
>>> [csclprd3-0-9:27276] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27276] *** End of error message ***
>>> [csclprd3-0-9:27277] [ 0] /lib64/libc.so.6(+0x32920)[0x7fbc00059920]
>>> [csclprd3-0-9:27277] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7fbbf9de811a]
>>> [csclprd3-0-9:27277] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7fbc00b6d0c9]
>>> [csclprd3-0-9:27277] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7fbc00b6d200]
>>> [csclprd3-0-9:27277] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7fbbfa1ef38e]
>>> [csclprd3-0-9:27277] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7fbbf93174e5]
>>> [csclprd3-0-9:27277] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7fbc00b87e26]
>>> [csclprd3-0-9:27277] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7fbc00ba6e10]
>>> [csclprd3-0-9:27277] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27277] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fbc00045cdd]
>>> [csclprd3-0-9:27277] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27277] *** End of error message ***
>>> [csclprd3-0-9:27278] [ 0] /lib64/libc.so.6(+0x32920)[0x7fbaf33a1920]
>>> [csclprd3-0-9:27278] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7fbaed0a211a]
>>> [csclprd3-0-9:27278] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7fbaf3eb50c9]
>>> [csclprd3-0-9:27278] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7fbaf3eb5200]
>>> [csclprd3-0-9:27278] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7fbaed4a938e]
>>> [csclprd3-0-9:27278] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7fbaec5d14e5]
>>> [csclprd3-0-9:27278] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7fbaf3ecfe26]
>>> [csclprd3-0-9:27278] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7fbaf3eeee10]
>>> [csclprd3-0-9:27278] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27278] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fbaf338dcdd]
>>> [csclprd3-0-9:27278] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27278] *** End of error message ***
>>> [csclprd3-0-9:27279] [ 0] /lib64/libc.so.6(+0x32920)[0x7fab71930920]
>>> [csclprd3-0-9:27279] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7fab675f111a]
>>> [csclprd3-0-9:27279] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7fab724440c9]
>>> [csclprd3-0-9:27279] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7fab72444200]
>>> [csclprd3-0-9:27279] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7fab679f838e]
>>> [csclprd3-0-9:27279] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7fab66b204e5]
>>> [csclprd3-0-9:27279] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7fab7245ee26]
>>> [csclprd3-0-9:27279] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7fab7247de10]
>>> [csclprd3-0-9:27279] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27279] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fab7191ccdd]
>>> [csclprd3-0-9:27279] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27279] *** End of error message ***
>>> [csclprd3-0-9:27280] [ 0] /lib64/libc.so.6(+0x32920)[0x7f0bc7a18920]
>>> [csclprd3-0-9:27280] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7f0bc163b11a]
>>> [csclprd3-0-9:27280] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7f0bc852c0c9]
>>> [csclprd3-0-9:27280] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7f0bc852c200]
>>> [csclprd3-0-9:27280] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7f0bc1a4238e]
>>> [csclprd3-0-9:27280] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7f0bc0b6a4e5]
>>> [csclprd3-0-9:27280] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7f0bc8546e26]
>>> [csclprd3-0-9:27280] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7f0bc8565e10]
>>> [csclprd3-0-9:27280] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27280] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f0bc7a04cdd]
>>> [csclprd3-0-9:27280] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27280] *** End of error message ***
>>> [csclprd3-0-9:27281] [ 0] /lib64/libc.so.6(+0x32920)[0x7f49cf009920]
>>> [csclprd3-0-9:27281] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7f49c8d0911a]
>>> [csclprd3-0-9:27281] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7f49cfb1d0c9]
>>> [csclprd3-0-9:27281] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7f49cfb1d200]
>>> [csclprd3-0-9:27281] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7f49c911038e]
>>> [csclprd3-0-9:27281] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7f49c82384e5]
>>> [csclprd3-0-9:27281] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7f49cfb37e26]
>>> [csclprd3-0-9:27281] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7f49cfb56e10]
>>> [csclprd3-0-9:27281] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27281] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f49ceff5cdd]
>>> [csclprd3-0-9:27281] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27281] *** End of error message ***
>>> [csclprd3-0-9:27282] [ 0] /lib64/libc.so.6(+0x32920)[0x7fe65bb89920]
>>> [csclprd3-0-9:27282] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7fe6557b711a]
>>> [csclprd3-0-9:27282] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7fe65c69d0c9]
>>> [csclprd3-0-9:27282] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7fe65c69d200]
>>> [csclprd3-0-9:27282] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7fe655bbe38e]
>>> [csclprd3-0-9:27282] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7fe654ce64e5]
>>> [csclprd3-0-9:27282] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7fe65c6b7e26]
>>> [csclprd3-0-9:27282] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7fe65c6d6e10]
>>> [csclprd3-0-9:27282] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27282] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fe65bb75cdd]
>>> [csclprd3-0-9:27282] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27282] *** End of error message ***
>>> [csclprd3-0-9:27283] [ 0] /lib64/libc.so.6(+0x32920)[0x7f79bd430920]
>>> [csclprd3-0-9:27283] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7f79b31e911a]
>>> [csclprd3-0-9:27283] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7f79bdf440c9]
>>> [csclprd3-0-9:27283] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7f79bdf44200]
>>> [csclprd3-0-9:27283] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7f79b35f038e]
>>> [csclprd3-0-9:27283] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7f79b27184e5]
>>> [csclprd3-0-9:27283] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7f79bdf5ee26]
>>> [csclprd3-0-9:27283] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7f79bdf7de10]
>>> [csclprd3-0-9:27283] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27283] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f79bd41ccdd]
>>> [csclprd3-0-9:27283] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27283] *** End of error message ***
>>> /lib64/libc.so.6(+0x32920)[0x7f83c367f920]
>>> [csclprd3-0-9:27274] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7f83bd2f211a]
>>> [csclprd3-0-9:27274] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7f83c41930c9]
>>> [csclprd3-0-9:27274] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7f83c4193200]
>>> [csclprd3-0-9:27274] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7f83bd6f938e]
>>> [csclprd3-0-9:27274] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7f83bc8214e5]
>>> [csclprd3-0-9:27274] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7f83c41ade26]
>>> [csclprd3-0-9:27274] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7f83c41cce10]
>>> [csclprd3-0-9:27274] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27274] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f83c366bcdd]
>>> [csclprd3-0-9:27274] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27274] *** End of error message ***
>>> [csclprd3-0-9:27284] *** Process received signal ***
>>> [csclprd3-0-9:27284] Signal: Bus error (7)
>>> [csclprd3-0-9:27284] Signal code: Non-existant physical address (2)
>>> [csclprd3-0-9:27284] Failing at address: 0x7f7273480e80
>>> [csclprd3-0-9:27284] [ 0] /lib64/libc.so.6(+0x32920)[0x7f7287160920]
>>> [csclprd3-0-9:27284] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7f7280ebf11a]
>>> [csclprd3-0-9:27284] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7f7287c740c9]
>>> [csclprd3-0-9:27284] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7f7287c74200]
>>> [csclprd3-0-9:27284] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7f72812c638e]
>>> [csclprd3-0-9:27284] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7f72803ee4e5]
>>> [csclprd3-0-9:27284] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7f7287c8ee26]
>>> [csclprd3-0-9:27284] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7f7287cade10]
>>> [csclprd3-0-9:27284] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27284] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f728714ccdd]
>>> [csclprd3-0-9:27284] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27284] *** End of error message ***
>>> [csclprd3-0-9:27272] [ 0] /lib64/libc.so.6(+0x32920)[0x7f646a415920]
>>> [csclprd3-0-9:27272] [ 1] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_btl_sm.so(+0x511a)[0x7f64641b911a]
>>> [csclprd3-0-9:27272] [ 2] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_grow+0x239)[0x7f646af290c9]
>>> [csclprd3-0-9:27272] [ 3] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_free_list_resize_mt+0x40)[0x7f646af29200]
>>> [csclprd3-0-9:27272] [ 4] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_bml_r2.so(+0x138e)[0x7f64645c038e]
>>> [csclprd3-0-9:27272] [ 5] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd5)[0x7f645f5344e5]
>>> [csclprd3-0-9:27272] [ 6] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(ompi_mpi_init+0x8d6)[0x7f646af43e26]
>>> [csclprd3-0-9:27272] [ 7] 
>>> /hpc/apps/mpi/openmpi/1.8.2/lib/libmpi.so.1(MPI_Init+0x170)[0x7f646af62e10]
>>> [csclprd3-0-9:27272] [ 8] /hpc/apps/benchmarks/hpl/xhpl[0x401571]
>>> [csclprd3-0-9:27272] [ 9] 
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f646a401cdd]
>>> [csclprd3-0-9:27272] [10] /hpc/apps/benchmarks/hpl/xhpl[0x401439]
>>> [csclprd3-0-9:27272] *** End of error message ***
>>> --------------------------------------------------------------------------
>>> mpirun noticed that process rank 88 with PID 27269 on node csclprd3-0-9 
>>> exited on signal 7 (Bus error).
>>> --------------------------------------------------------------------------
>>> 16 total processes killed (some possibly by mpirun during cleanup)
>>> 
>>> c. hostfile
>>> csclprd3-6-1 slots=4 max-slots=4
>>> csclprd3-6-5 slots=4 max-slots=4
>>> csclprd3-0-0 slots=12 max-slots=24
>>> csclprd3-0-1 slots=6 max-slots=12
>>> csclprd3-0-2 slots=6 max-slots=12
>>> csclprd3-0-3 slots=6 max-slots=12
>>> csclprd3-0-4 slots=6 max-slots=12
>>> csclprd3-0-5 slots=6 max-slots=12
>>> csclprd3-0-6 slots=6 max-slots=12
>>> #total number of successfully tested non-hyperthreaded computes slots at 
>>> this point is 56
>>> csclprd3-0-7 slots=16 max-slots=32
>>> #total number of successfully tested slots at this point is 72
>>> csclprd3-0-8 slots=16 max-slots=32
>>> #total number of successfully tested slots at this point is 88
>>> csclprd3-0-9 slots=16 max-slots=32
>>> #total number of slots at this point is 104
>>> #csclprd3-0-10 slots=16 max-slots=32
>>> #csclprd3-0-11 slots=16 max-slots=32
>>> #total number of slots at this point is 136
>>> 
>>> 
>>> From: users [users-boun...@open-mpi.org 
>>> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain 
>>> [r...@open-mpi.org <mailto:r...@open-mpi.org>]
>>> Sent: Wednesday, April 08, 2015 11:31 AM
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
>>> 
>>> Just for clarity: does the BIOS on the LGA2011 system have HT enabled?
>>> 
>>>> On Apr 8, 2015, at 10:55 AM, Lane, William <william.l...@cshs.org 
>>>> <mailto:william.l...@cshs.org>> wrote:
>>>> 
>>>> Ralph,
>>>> 
>>>> I added one of the newer LGA2011 nodes to my hostfile,
>>>> re-ran the benchmark successfully, and saw some strange results WRT the
>>>> binding directives. Why are hyperthreading cores being used
>>>> on the LGA2011 system but not on any of the other systems (which
>>>> are mostly hyperthreaded Westmeres)? Isn't the --use-hwthread-cpus
>>>> switch supposed to prevent OpenMPI from using hyperthreaded
>>>> cores?
>>>> 
>>>> OpenMPI LAPACK invocation:
>>>> 
>>>> $MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile 
>>>> hostfile-single --mca btl_tcp_if_include eth0 --hetero-nodes 
>>>> --use-hwthread-cpus --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN
>>>> 
>>>> Where NSLOTS=72
>>>> 
>>>> hostfile:
>>>> csclprd3-6-1 slots=4 max-slots=4
>>>> csclprd3-6-5 slots=4 max-slots=4
>>>> csclprd3-0-0 slots=12 max-slots=24
>>>> csclprd3-0-1 slots=6 max-slots=12
>>>> csclprd3-0-2 slots=6 max-slots=12
>>>> csclprd3-0-3 slots=6 max-slots=12
>>>> csclprd3-0-4 slots=6 max-slots=12
>>>> csclprd3-0-5 slots=6 max-slots=12
>>>> csclprd3-0-6 slots=6 max-slots=12
>>>> #total number of successfully tested non-hyperthreaded computes slots at 
>>>> this point is 56
>>>> csclprd3-0-7 slots=16 max-slots=32
>>>> 
>>>> LGA1366 Westmere w/two Intel Xeon X5675 6-core/12-hyperthread CPU's
>>>> 
>>>> [csclprd3-0-0:11848] MCW rank 11 bound to socket 1[core 7[hwt 0]]: 
>>>> [./././././.][./B/./././.]
>>>> [csclprd3-0-0:11848] MCW rank 12 bound to socket 0[core 2[hwt 0]]: 
>>>> [././B/././.][./././././.]
>>>> [csclprd3-0-0:11848] MCW rank 13 bound to socket 1[core 8[hwt 0]]: 
>>>> [./././././.][././B/././.]
>>>> [csclprd3-0-0:11848] MCW rank 14 bound to socket 0[core 3[hwt 0]]: 
>>>> [./././B/./.][./././././.]
>>>> [csclprd3-0-0:11848] MCW rank 15 bound to socket 1[core 9[hwt 0]]: 
>>>> [./././././.][./././B/./.]
>>>> [csclprd3-0-0:11848] MCW rank 16 bound to socket 0[core 4[hwt 0]]: 
>>>> [././././B/.][./././././.]
>>>> [csclprd3-0-0:11848] MCW rank 17 bound to socket 1[core 10[hwt 0]]: 
>>>> [./././././.][././././B/.]
>>>> [csclprd3-0-0:11848] MCW rank 18 bound to socket 0[core 5[hwt 0]]: 
>>>> [./././././B][./././././.]
>>>> [csclprd3-0-0:11848] MCW rank 19 bound to socket 1[core 11[hwt 0]]: 
>>>> [./././././.][./././././B]
>>>> [csclprd3-0-0:11848] MCW rank 8 bound to socket 0[core 0[hwt 0]]: 
>>>> [B/././././.][./././././.]
>>>> [csclprd3-0-0:11848] MCW rank 9 bound to socket 1[core 6[hwt 0]]: 
>>>> [./././././.][B/././././.]
>>>> [csclprd3-0-0:11848] MCW rank 10 bound to socket 0[core 1[hwt 0]]: 
>>>> [./B/./././.][./././././.]
>>>> 
>>>> but for the LGA2011 system w/two 8-core/16-hyperthread CPU's 
>>>> 
>>>> [csclprd3-0-7:30876] MCW rank 60 bound to socket 0[core 2[hwt 0-1]]: 
>>>> [../../BB/../../../../..][../../../../../../../..]
>>>> [csclprd3-0-7:30876] MCW rank 61 bound to socket 1[core 10[hwt 0-1]]: 
>>>> [../../../../../../../..][../../BB/../../../../..]
>>>> [csclprd3-0-7:30876] MCW rank 62 bound to socket 0[core 3[hwt 0-1]]: 
>>>> [../../../BB/../../../..][../../../../../../../..]
>>>> [csclprd3-0-7:30876] MCW rank 63 bound to socket 1[core 11[hwt 0-1]]: 
>>>> [../../../../../../../..][../../../BB/../../../..]
>>>> [csclprd3-0-7:30876] MCW rank 64 bound to socket 0[core 4[hwt 0-1]]: 
>>>> [../../../../BB/../../..][../../../../../../../..]
>>>> [csclprd3-0-7:30876] MCW rank 65 bound to socket 1[core 12[hwt 0-1]]: 
>>>> [../../../../../../../..][../../../../BB/../../..]
>>>> [csclprd3-0-7:30876] MCW rank 66 bound to socket 0[core 5[hwt 0-1]]: 
>>>> [../../../../../BB/../..][../../../../../../../..]
>>>> [csclprd3-0-7:30876] MCW rank 67 bound to socket 1[core 13[hwt 0-1]]: 
>>>> [../../../../../../../..][../../../../../BB/../..]
>>>> [csclprd3-0-7:30876] MCW rank 68 bound to socket 0[core 6[hwt 0-1]]: 
>>>> [../../../../../../BB/..][../../../../../../../..]
>>>> [csclprd3-0-7:30876] MCW rank 69 bound to socket 1[core 14[hwt 0-1]]: 
>>>> [../../../../../../../..][../../../../../../BB/..]
>>>> [csclprd3-0-7:30876] MCW rank 70 bound to socket 0[core 7[hwt 0-1]]: 
>>>> [../../../../../../../BB][../../../../../../../..]
>>>> [csclprd3-0-7:30876] MCW rank 71 bound to socket 1[core 15[hwt 0-1]]: 
>>>> [../../../../../../../..][../../../../../../../BB]
>>>> [csclprd3-0-7:30876] MCW rank 56 bound to socket 0[core 0[hwt 0-1]]: 
>>>> [BB/../../../../../../..][../../../../../../../..]
>>>> [csclprd3-0-7:30876] MCW rank 57 bound to socket 1[core 8[hwt 0-1]]: 
>>>> [../../../../../../../..][BB/../../../../../../..]
>>>> [csclprd3-0-7:30876] MCW rank 58 bound to socket 0[core 1[hwt 0-1]]: 
>>>> [../BB/../../../../../..][../../../../../../../..]
>>>> [csclprd3-0-7:30876] MCW rank 59 bound to socket 1[core 9[hwt 0-1]]: 
>>>> [../../../../../../../..][../BB/../../../../../..]
>>>> 
>>>> 
>>>> 
>>>> 
>>>> From: users [users-boun...@open-mpi.org 
>>>> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain 
>>>> [r...@open-mpi.org <mailto:r...@open-mpi.org>]
>>>> Sent: Wednesday, April 08, 2015 10:26 AM
>>>> To: Open MPI Users
>>>> Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
>>>> 
>>>> 
>>>>> On Apr 8, 2015, at 9:29 AM, Lane, William <william.l...@cshs.org 
>>>>> <mailto:william.l...@cshs.org>> wrote:
>>>>> 
>>>>> Ralph,
>>>>> 
>>>>> Thanks for YOUR help; I never
>>>>> would've managed to get the LAPACK
>>>>> benchmark running on more than one
>>>>> node in our cluster without your help.
>>>>> 
>>>>> Ralph, is hyperthreading more of a curse
>>>>> than an advantage for HPC applications?
>>>> 
>>>> Wow, you’ll get a lot of argument over that issue! From what I can see, it 
>>>> is very application dependent. Some apps appear to benefit, while others 
>>>> can even suffer from it.
>>>> 
>>>> I think we should support a mix of nodes in this usage, so I’ll try to 
>>>> come up with a way to do so.
>>>> 
>>>>> 
>>>>> I'm going to go through all the OpenMPI 
>>>>> articles on hyperthreading and NUMA to
>>>>> see if that will shed any light on these
>>>>> issues.
>>>>> 
>>>>> -Bill L.
>>>>> 
>>>>> 
>>>>> From: users [users-boun...@open-mpi.org 
>>>>> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain 
>>>>> [r...@open-mpi.org <mailto:r...@open-mpi.org>]
>>>>> Sent: Tuesday, April 07, 2015 7:32 PM
>>>>> To: Open MPI Users
>>>>> Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
>>>>> 
>>>>> I’m not sure our man pages are good enough to answer your question, but 
>>>>> here is the URL
>>>>> 
>>>>> http://www.open-mpi.org/doc/v1.8/
>>>>> 
>>>>> I’m a tad tied up right now, but I’ll try to address this prior to 1.8.5 
>>>>> release. Thanks for all that debug effort! Helps a bunch.
>>>>> 
>>>>>> On Apr 7, 2015, at 1:17 PM, Lane, William <william.l...@cshs.org 
>>>>>> <mailto:william.l...@cshs.org>> wrote:
>>>>>> 
>>>>>> Ralph,
>>>>>> 
>>>>>> I've finally had some luck using the following:
>>>>>> $MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile 
>>>>>> hostfile-single --mca btl_tcp_if_include eth0 --hetero-nodes 
>>>>>> --use-hwthread-cpus --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN
>>>>>> 
>>>>>> Where $NSLOTS was 56 and my hostfile hostfile-single is:
>>>>>> 
>>>>>> csclprd3-0-0 slots=12 max-slots=24
>>>>>> csclprd3-0-1 slots=6 max-slots=12
>>>>>> csclprd3-0-2 slots=6 max-slots=12
>>>>>> csclprd3-0-3 slots=6 max-slots=12
>>>>>> csclprd3-0-4 slots=6 max-slots=12
>>>>>> csclprd3-0-5 slots=6 max-slots=12
>>>>>> csclprd3-0-6 slots=6 max-slots=12
>>>>>> csclprd3-6-1 slots=4 max-slots=4
>>>>>> csclprd3-6-5 slots=4 max-slots=4
>>>>>> 
>>>>>> The max-slots differs from slots on some nodes
>>>>>> because I include the hyperthreaded cores in
>>>>>> the max-slots; the last two nodes have CPU's that
>>>>>> don't support hyperthreading at all.
>>>>>> 
>>>>>> Does --use-hwthread-cpus prevent slots from
>>>>>> being assigned to hyperthreading cores?
>>>>>> 
>>>>>> For some reason the manpage for OpenMPI 1.8.2
>>>>>> isn't installed on our CentOS 6.3 systems. Is there a
>>>>>> URL where I can find a copy of the manpages for OpenMPI 1.8.2?
>>>>>> 
>>>>>> Thanks for your help,
>>>>>> 
>>>>>> -Bill Lane
>>>>>> 
>>>>>> From: users [users-boun...@open-mpi.org 
>>>>>> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain 
>>>>>> [r...@open-mpi.org <mailto:r...@open-mpi.org>]
>>>>>> Sent: Monday, April 06, 2015 1:39 PM
>>>>>> To: Open MPI Users
>>>>>> Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
>>>>>> 
>>>>>> Hmmm…well, that shouldn’t be the issue. To check, try running it with 
>>>>>> “bind-to none”. If you can get a backtrace telling us where it is 
>>>>>> crashing, that would also help.
>>>>>> 
>>>>>> 
>>>>>>> On Apr 6, 2015, at 12:24 PM, Lane, William <william.l...@cshs.org 
>>>>>>> <mailto:william.l...@cshs.org>> wrote:
>>>>>>> 
>>>>>>> Ralph,
>>>>>>> 
>>>>>>> For the following two different commandline invocations of the LAPACK 
>>>>>>> benchmark
>>>>>>> 
>>>>>>> $MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile 
>>>>>>> hostfile-no_slots --mca btl_tcp_if_include eth0 --hetero-nodes 
>>>>>>> --use-hwthread-cpus --bind-to hwthread --prefix $MPI_DIR 
>>>>>>> $BENCH_DIR/$APP_DIR/$APP_BIN
>>>>>>> 
>>>>>>> $MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile 
>>>>>>> hostfile-no_slots --mca btl_tcp_if_include eth0 --hetero-nodes 
>>>>>>> --bind-to-core --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN
>>>>>>> 
>>>>>>> I'm receiving the same kinds of OpenMPI error messages (but for 
>>>>>>> different nodes in the ring):
>>>>>>> 
>>>>>>>         [csclprd3-0-16:25940] *** Process received signal ***
>>>>>>>         [csclprd3-0-16:25940] Signal: Bus error (7)
>>>>>>>         [csclprd3-0-16:25940] Signal code: Non-existant physical 
>>>>>>> address (2)
>>>>>>>         [csclprd3-0-16:25940] Failing at address: 0x7f8b1b5a2600
>>>>>>> 
>>>>>>>         
>>>>>>> --------------------------------------------------------------------------
>>>>>>>         mpirun noticed that process rank 82 with PID 25936 on node 
>>>>>>> csclprd3-0-16 exited on signal 7 (Bus error).
>>>>>>>         
>>>>>>> --------------------------------------------------------------------------
>>>>>>>         16 total processes killed (some possibly by mpirun during 
>>>>>>> cleanup)
>>>>>>> 
>>>>>>> It seems to occur on systems that have more than one physical CPU installed.
>>>>>>> Could this be due to a lack of the correct NUMA libraries being installed?
>>>>>>> 
>>>>>>> -Bill L.
>>>>>>> 
>>>>>>> From: users [users-boun...@open-mpi.org 
>>>>>>> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain 
>>>>>>> [r...@open-mpi.org <mailto:r...@open-mpi.org>]
>>>>>>> Sent: Sunday, April 05, 2015 6:09 PM
>>>>>>> To: Open MPI Users
>>>>>>> Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
>>>>>>> 
>>>>>>> 
>>>>>>>> On Apr 5, 2015, at 5:58 PM, Lane, William <william.l...@cshs.org 
>>>>>>>> <mailto:william.l...@cshs.org>> wrote:
>>>>>>>> 
>>>>>>>> I think some of the Intel Blade systems in the cluster are
>>>>>>>> dual core, but don't support hyperthreading. Maybe it
>>>>>>>> would be better to exclude hyperthreading altogether
>>>>>>>> from submitted OpenMPI jobs?
>>>>>>> 
>>>>>>> Yes - or you can add "--hetero-nodes -use-hwthread-cpus --bind-to 
>>>>>>> hwthread" to the cmd line. This tells mpirun that the nodes aren't all 
>>>>>>> the same, and so it has to look at each node's topology instead of 
>>>>>>> taking the first node as the template for everything. The second tells 
>>>>>>> it to use the HTs as independent cpus where they are supported.
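
For concreteness, the full command line with those three options added, exactly as Bill subsequently ran it (quoted above in this thread), looks like:

    $MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile hostfile-no_slots \
        --mca btl_tcp_if_include eth0 --hetero-nodes --use-hwthread-cpus \
        --bind-to hwthread --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN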
>>>>>>> 
>>>>>>> I'm not entirely sure the suggestion will work - if we hit a place 
>>>>>>> where HT isn't supported, we may balk at being asked to bind to HTs. I 
>>>>>>> can probably make a change that supports this kind of hetero 
>>>>>>> arrangement (perhaps something like bind-to pu) - might make it into 
>>>>>>> 1.8.5 (we are just starting the release process on it now).
>>>>>>> 
>>>>>>>> 
>>>>>>>> OpenMPI doesn't crash, but it doesn't run the LAPACK
>>>>>>>> benchmark either.
>>>>>>>> 
>>>>>>>> Thanks again Ralph.
>>>>>>>> 
>>>>>>>> Bill L.
>>>>>>>> 
>>>>>>>> From: users [users-boun...@open-mpi.org 
>>>>>>>> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain 
>>>>>>>> [r...@open-mpi.org <mailto:r...@open-mpi.org>]
>>>>>>>> Sent: Wednesday, April 01, 2015 8:40 AM
>>>>>>>> To: Open MPI Users
>>>>>>>> Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
>>>>>>>> 
>>>>>>>> Bingo - you said the magic word. This is a terminology issue. When we 
>>>>>>>> say "core", we mean the old definition of "core", not "hyperthreads". 
>>>>>>>> If you want to use HTs as your base processing unit and bind to them, 
>>>>>>>> then you need to specify --bind-to hwthread. That warning should then 
>>>>>>>> go away.
>>>>>>>> 
>>>>>>>> We don't require a swap region be mounted - I didn't see anything in 
>>>>>>>> your original message indicating that OMPI had actually crashed, but 
>>>>>>>> just wasn't launching due to the above issue. Were you actually seeing 
>>>>>>>> crashes as well?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Wed, Apr 1, 2015 at 8:31 AM, Lane, William <william.l...@cshs.org 
>>>>>>>> <mailto:william.l...@cshs.org>> wrote:
>>>>>>>> Ralph,
>>>>>>>> 
>>>>>>>> Here's the associated hostfile:
>>>>>>>> 
>>>>>>>> #openMPI hostfile for csclprd3
>>>>>>>> #max slots prevents oversubscribing csclprd3-0-9
>>>>>>>> csclprd3-0-0 slots=12 max-slots=12
>>>>>>>> csclprd3-0-1 slots=6 max-slots=6
>>>>>>>> csclprd3-0-2 slots=6 max-slots=6
>>>>>>>> csclprd3-0-3 slots=6 max-slots=6
>>>>>>>> csclprd3-0-4 slots=6 max-slots=6
>>>>>>>> csclprd3-0-5 slots=6 max-slots=6
>>>>>>>> csclprd3-0-6 slots=6 max-slots=6
>>>>>>>> csclprd3-0-7 slots=32 max-slots=32
>>>>>>>> csclprd3-0-8 slots=32 max-slots=32
>>>>>>>> csclprd3-0-9 slots=32 max-slots=32
>>>>>>>> csclprd3-0-10 slots=32 max-slots=32
>>>>>>>> csclprd3-0-11 slots=32 max-slots=32
>>>>>>>> csclprd3-0-12 slots=12 max-slots=12
>>>>>>>> csclprd3-0-13 slots=24 max-slots=24
>>>>>>>> csclprd3-0-14 slots=16 max-slots=16
>>>>>>>> csclprd3-0-15 slots=16 max-slots=16
>>>>>>>> csclprd3-0-16 slots=24 max-slots=24
>>>>>>>> csclprd3-0-17 slots=24 max-slots=24
>>>>>>>> csclprd3-6-1 slots=4 max-slots=4
>>>>>>>> csclprd3-6-5 slots=4 max-slots=4
>>>>>>>> 
>>>>>>>> The number of slots also includes hyperthreading
>>>>>>>> cores.
>>>>>>>> 
>>>>>>>> One more question: would not having swap
>>>>>>>> partitions defined on all the nodes in the ring cause OpenMPI
>>>>>>>> to crash? No swap partitions are defined
>>>>>>>> for any of the above systems.
>>>>>>>> 
>>>>>>>> -Bill L.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> From: users [users-boun...@open-mpi.org 
>>>>>>>> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain 
>>>>>>>> [r...@open-mpi.org <mailto:r...@open-mpi.org>]
>>>>>>>> Sent: Wednesday, April 01, 2015 5:04 AM
>>>>>>>> To: Open MPI Users
>>>>>>>> Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
>>>>>>>> 
>>>>>>>> The warning about binding to memory is due to not having numactl-devel 
>>>>>>>> installed on the system. The job would still run, but we are warning 
>>>>>>>> you that we cannot bind memory to the same domain as the core where we 
>>>>>>>> bind the process. Can cause poor performance, but not fatal. I forget 
>>>>>>>> the name of the param, but you can tell us to "shut up" :-)
>>>>>>>> 
>>>>>>>> The other warning/error indicates that we aren't seeing enough cores 
>>>>>>>> on the allocation you gave us via the hostfile to support one proc/core 
>>>>>>>> - i.e., we didn't see at least 128 cores in the sum of the nodes you told 
>>>>>>>> us about. I take it you were expecting that there were that many or 
>>>>>>>> more?
>>>>>>>> 
>>>>>>>> Ralph
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Wed, Apr 1, 2015 at 12:54 AM, Lane, William <william.l...@cshs.org 
>>>>>>>> <mailto:william.l...@cshs.org>> wrote:
>>>>>>>> I'm having problems running OpenMPI jobs
>>>>>>>> (using a hostfile) on an HPC cluster running
>>>>>>>> ROCKS on CentOS 6.3. I'm running OpenMPI
>>>>>>>> outside of Sun Grid Engine (i.e. it is not submitted
>>>>>>>> as a job to SGE). The program being run is a LAPACK
>>>>>>>> benchmark. The command line I'm
>>>>>>>> using to run the jobs is:
>>>>>>>> 
>>>>>>>> $MPI_DIR/bin/mpirun -np $NSLOTS -bind-to-core -report-bindings 
>>>>>>>> --hostfile hostfile --mca btl_tcp_if_include eth0 --prefix $MPI_DIR 
>>>>>>>> $BENCH_DIR/$APP_DIR/$APP_BIN
>>>>>>>> 
>>>>>>>> Where MPI_DIR=/hpc/apps/mpi/openmpi/1.8.2/
>>>>>>>> NSLOTS=128
>>>>>>>> 
>>>>>>>> I'm getting errors of the following form, and OpenMPI never runs the LAPACK
>>>>>>>> benchmark:
>>>>>>>> 
>>>>>>>>    --------------------------------------------------------------------------
>>>>>>>>    WARNING: a request was made to bind a process. While the system
>>>>>>>>    supports binding the process itself, at least one node does NOT
>>>>>>>>    support binding memory to the process location.
>>>>>>>> 
>>>>>>>>     Node:  csclprd3-0-11
>>>>>>>> 
>>>>>>>>    This usually is due to not having the required NUMA support installed
>>>>>>>>    on the node. In some Linux distributions, the required support is
>>>>>>>>    contained in the libnumactl and libnumactl-devel packages.
>>>>>>>>    This is a warning only; your job will continue, though performance may be degraded.
>>>>>>>>    --------------------------------------------------------------------------
>>>>>>>> 
>>>>>>>>    --------------------------------------------------------------------------
>>>>>>>>    A request was made to bind to that would result in binding more
>>>>>>>>    processes than cpus on a resource:
>>>>>>>> 
>>>>>>>>       Bind to:     CORE
>>>>>>>>       Node:        csclprd3-0-11
>>>>>>>>       #processes:  2
>>>>>>>>       #cpus:       1
>>>>>>>> 
>>>>>>>>    You can override this protection by adding the "overload-allowed"
>>>>>>>>    option to your binding directive.
>>>>>>>>    --------------------------------------------------------------------------
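For reference, the "overload-allowed" override mentioned in that message is a qualifier on the binding directive. A sketch against the 1.8.x command line quoted above, written from memory rather than from the man page, and note that it only masks the core-count mismatch rather than fixing it:

    # Permit binding more processes than cores on a node (1.8.x-style qualifier)
    $MPI_DIR/bin/mpirun -np $NSLOTS --bind-to core:overload-allowed --report-bindings \
        --hostfile hostfile --mca btl_tcp_if_include eth0 --prefix $MPI_DIR \
        $BENCH_DIR/$APP_DIR/$APP_BIN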
>>>>>>>> 
>>>>>>>> The only installed numa packages are:
>>>>>>>> numactl.x86_64          2.0.7-3.el6          @centos6.3-x86_64-0/$
>>>>>>>> 
>>>>>>>> When I search for the available NUMA packages I find:
>>>>>>>> 
>>>>>>>> yum search numa | less
>>>>>>>> 
>>>>>>>>         Loaded plugins: fastestmirror
>>>>>>>>         Loading mirror speeds from cached hostfile
>>>>>>>>         ============================== N/S Matched: numa ===============================
>>>>>>>>         numactl-devel.i686 : Development package for building Applications that use numa
>>>>>>>>         numactl-devel.x86_64 : Development package for building Applications that use numa
>>>>>>>>         numad.x86_64 : NUMA user daemon
>>>>>>>>         numactl.i686 : Library for tuning for Non Uniform Memory Access machines
>>>>>>>>         numactl.x86_64 : Library for tuning for Non Uniform Memory Access machines
>>>>>>>> 
>>>>>>>> Do I need to install additional and/or different NUMA packages in 
>>>>>>>> order to get OpenMPI to work
>>>>>>>> on this cluster?
>>>>>>>> 
>>>>>>>> -Bill Lane