Hi, I am running the following script on my SCG cluster (submitted with qsub):

#!/bin/bash
#$-cwd
#$ -S /bin/bash
#$ -V
#$ -q normal
#$ -pe mpi 40
#$ -P Lab219
#$ -o output
#$ -e error
module load PhyML/3.3
mpirun --mca pml yalla -np 40 phyml-mpi -i proteic -b 10 -d aa

where phyml-mpi is the parallel (Open MPI) build of the program PhyML. The --mca pml yalla option is passed in order to use MXM (I have Mellanox OFED).


It gives me lots of errors related to KNEM (see the error and output files from qsub in the attachments), even though I specified the KNEM directory when installing OMPI. I can't make sense of these errors and would appreciate any hint on this issue. I have also run Open MPI on a script of my own (just a loop running something like: command --help) and got no errors.


Thanks in advance

--------------------------------------------------------------------------
WARNING: Open MPI failed to open the /dev/knem device due to a local
error. Please check with your system administrator to get the problem
fixed, or set the btl_vader_single_copy_mechanism MCA variable to none
to silence this warning and run without knem support.

The vader shared memory BTL will fall back on another single-copy
mechanism if one is available. This may result in lower performance.

  Local host: NODE3
  Errno:      2 (No such file or directory)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: Linux kernel knem support was requested via the
btl_vader_single_copy_mechanism MCA parameter, but Knem support was either not
compiled into this Open MPI installation, or Knem support was unable
to be activated in this process.

The vader BTL will fall back on another single-copy mechanism if one
is available. This may result in lower performance.

  Local host: NODE3
--------------------------------------------------------------------------
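The first warning above names the btl_vader_single_copy_mechanism MCA variable. A minimal sketch of setting it to none from inside the job script, assuming the line is placed before the mpirun call (Open MPI reads MCA variables from environment variables prefixed with OMPI_MCA_):

```shell
# Ask the vader BTL not to use any single-copy mechanism (including knem).
# Assumption: this line sits in the qsub script before the mpirun call.
export OMPI_MCA_btl_vader_single_copy_mechanism=none
```

This would only be expected to silence the two vader/knem warnings; the MXM "Failed to open uverbs0" errors and the segmentation faults further down come from a different code path (mxm_init via pml yalla) and are probably not affected by it.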
[1484652493.596258] [NODE3:185606:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652493.596275] [NODE3:185604:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652493.596270] [NODE3:185607:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652493.596332] [NODE3:185605:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652493.596546] [NODE3:185608:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652493.597514] [NODE3:185610:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652493.599711] [NODE3:185613:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484653451.955637] [NODE7:155953:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484653451.955696] [NODE7:155953:0]         shm.c:69   MXM  WARN  Unable to close the KNEM device file
[1484652493.601480] [NODE3:185616:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484653451.955739] [NODE7:155954:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484653451.955783] [NODE7:155954:0]         shm.c:69   MXM  WARN  Unable to close the KNEM device file
[1484653451.958300] [NODE7:155955:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484653451.960995] [NODE7:155956:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484653451.961026] [NODE7:155956:0]         shm.c:69   MXM  WARN  Unable to close the KNEM device file
[1484653451.960948] [NODE7:155960:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484653451.960979] [NODE7:155960:0]         shm.c:69   MXM  WARN  Unable to close the KNEM device file
[1484653451.961482] [NODE7:155957:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652512.164585] [NODE2:22129:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652512.165089] [NODE2:22130:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652512.168559] [NODE2:22131:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652512.169180] [NODE2:22132:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652512.172381] [NODE2:22134:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652512.173899] [NODE2:22137:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652512.176080] [NODE2:22140:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652512.177936] [NODE2:22144:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652512.181078] [NODE2:22146:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652512.182780] [NODE2:22150:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
[1484652512.183750] [NODE2:22154:0]      ib_dev.c:159  MXM  ERROR Failed to open uverbs0
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
    0  /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7f554717a0b0]
    1  /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0x3) [0x7f553f5db0c3]
    2  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401) 
[0x7f553c9f90a1]
    3  /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7f553c9f88df]
    4  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43) 
[0x7f553c9fab63]
    5  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7f553c9fda57]
    6  
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107) 
[0x7f553d714577]
    7  
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
 [0x7f5546c2a3a5]
    8  /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7f55477bde09]
    9  /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d) 
[0x7f5546c3322d]
    0  /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7f0201df10b0]
    1  /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0x3) [0x7f01fe3900c3]
    2  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401) 
[0x7f01f77310a1]
    3  /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7f01f77308df]
    4  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43) 
[0x7f01f7732b63]
    5  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7f01f7735a57]
    6  
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107) 
[0x7f01fc4c9577]
    7  
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
 [0x7f02018a13a5]
    8  /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7f0202434e09]
    9  /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d) 
[0x7f02018aa22d]
   10  /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7f02023dae66]
   11  /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7f02023f9fbb]
   12  phyml-mpi() [0x401cf9]
   13  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f0201ddc7ed]
   14  phyml-mpi() [0x4028f9]
===================
    0  /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7f13a17e60b0]
    1  /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0x3) [0x7f139dd850c3]
    2  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401) 
[0x7f13970e00a1]
    3  /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7f13970df8df]
    4  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43) 
[0x7f13970e1b63]
    5  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7f13970e4a57]
    6  
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107) 
[0x7f1397dfb577]
    7  
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
 [0x7f13a12963a5]
    8  /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7f13a1e29e09]
    9  /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d) 
[0x7f13a129f22d]
   10  /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7f13a1dcfe66]
   11  /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7f13a1deefbb]
   12  phyml-mpi() [0x401cf9]
   13  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f13a17d17ed]
   14  phyml-mpi() [0x4028f9]
===================
    0  /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7fc3a87530b0]
    1  /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0) [0x7fc3a4cf20c0]
    2  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401) 
[0x7fc39e0bb0a1]
    3  /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7fc39e0ba8df]
    4  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43) 
[0x7fc39e0bcb63]
    5  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7fc39e0bfa57]
    6  
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107) 
[0x7fc39edd6577]
    7  
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
 [0x7fc3a82033a5]
    8  /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7fc3a8d96e09]
    9  /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d) 
[0x7fc3a820c22d]
   10  /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7fc3a8d3ce66]
   11  /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7fc3a8d5bfbb]
   12  phyml-mpi() [0x401cf9]
   13  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7fc3a873e7ed]
   14  phyml-mpi() [0x4028f9]
===================
   10  /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7f5547763e66]
   11  /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7f5547782fbb]
   12  phyml-mpi() [0x401cf9]
   13  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f55471657ed]
   14  phyml-mpi() [0x4028f9]
===================
    0  /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7f16aa05b0b0]
    1  /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0x3) [0x7f16a65fa0c3]
    2  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401) 
[0x7f169f9c10a1]
    3  /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7f169f9c08df]
    4  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43) 
[0x7f169f9c2b63]
    5  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7f169f9c5a57]
    6  
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107) 
[0x7f16a4733577]
    7  
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
 [0x7f16a9b0b3a5]
    8  /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7f16aa69ee09]
    9  /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d) 
[0x7f16a9b1422d]
   10  /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7f16aa644e66]
   11  /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7f16aa663fbb]
   12  phyml-mpi() [0x401cf9]
   13  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f16aa0467ed]
   14  phyml-mpi() [0x4028f9]
===================
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
    0  /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7f0cdb7040b0]
    1  /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0x3) [0x7f0cd3bf20c3]
    2  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401) 
[0x7f0cd10100a1]
    3  /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7f0cd100f8df]
    4  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43) 
[0x7f0cd1011b63]
    5  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7f0cd1014a57]
    6  
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107) 
[0x7f0cd1d2b577]
    7  
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
 [0x7f0cdb1b43a5]
    8  /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7f0cdbd47e09]
    9  /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d) 
[0x7f0cdb1bd22d]
   10  /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7f0cdbcede66]
   11  /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7f0cdbd0cfbb]
   12  phyml-mpi() [0x401cf9]
   13  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f0cdb6ef7ed]
   14  phyml-mpi() [0x4028f9]
===================
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
    0  /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7f14f8a580b0]
    1  /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0x3) [0x7f14f4ff70c3]
    2  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401) 
[0x7f14ee2be0a1]
    3  /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7f14ee2bd8df]
    4  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43) 
[0x7f14ee2bfb63]
    5  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7f14ee2c2a57]
    6  
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107) 
[0x7f14eefd9577]
    7  
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
 [0x7f14f85083a5]
    8  /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7f14f909be09]
    9  /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d) 
[0x7f14f851122d]
   10  /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7f14f9041e66]
   11  /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7f14f9060fbb]
   12  phyml-mpi() [0x401cf9]
   13  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f14f8a437ed]
   14  phyml-mpi() [0x4028f9]
===================
[NODE7:155949] [[55778,0],2] usock_peer_send_blocking: send() to socket 56 failed: Broken pipe (32)
[NODE7:155949] [[55778,0],2] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[NODE7:155949] [[55778,0],2]-[[55778,1],16] usock_peer_accept: usock_peer_send_connect_ack failed
[NODE7:155949] [[55778,0],2] usock_peer_send_blocking: send() to socket 59 failed: Broken pipe (32)
[NODE7:155949] [[55778,0],2] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[NODE7:155949] [[55778,0],2]-[[55778,1],17] usock_peer_accept: usock_peer_send_connect_ack failed
Warning: Conflicting CPU frequencies detected, using: 2201.000000.
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
[NODE7:155953] PML yalla cannot be selected
Warning: Conflicting CPU frequencies detected, using: 2201.000000.
[NODE7:155954] PML yalla cannot be selected
Warning: Conflicting CPU frequencies detected, using: 2201.000000.
[NODE7:155955] PML yalla cannot be selected
Warning: Conflicting CPU frequencies detected, using: 2201.000000.
Warning: Conflicting CPU frequencies detected, using: 2201.000000.
Warning: Conflicting CPU frequencies detected, using: 2201.000000.
[NODE2:22125] [[55778,0],1] usock_peer_send_blocking: send() to socket 51 failed: Broken pipe (32)
[NODE2:22125] [[55778,0],1] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[NODE2:22125] [[55778,0],1]-[[55778,1],1] usock_peer_accept: usock_peer_send_connect_ack failed
[NODE2:22125] [[55778,0],1] usock_peer_send_blocking: send() to socket 57 failed: Broken pipe (32)
[NODE2:22125] [[55778,0],1] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[NODE2:22125] [[55778,0],1]-[[55778,1],2] usock_peer_accept: usock_peer_send_connect_ack failed
[NODE2:22125] [[55778,0],1] usock_peer_send_blocking: send() to socket 59 failed: Broken pipe (32)
[NODE2:22125] [[55778,0],1] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[NODE2:22125] [[55778,0],1]-[[55778,1],3] usock_peer_accept: usock_peer_send_connect_ack failed
[NODE2:22125] [[55778,0],1] usock_peer_send_blocking: send() to socket 68 failed: Broken pipe (32)
[NODE2:22125] [[55778,0],1] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[NODE2:22125] [[55778,0],1]-[[55778,1],4] usock_peer_accept: usock_peer_send_connect_ack failed
[NODE2:22125] [[55778,0],1] usock_peer_send_blocking: send() to socket 72 failed: Broken pipe (32)
[NODE2:22125] [[55778,0],1] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[NODE2:22125] [[55778,0],1]-[[55778,1],5] usock_peer_accept: usock_peer_send_connect_ack failed
[NODE2:22125] [[55778,0],1] usock_peer_send_blocking: send() to socket 75 failed: Broken pipe (32)
[NODE2:22125] [[55778,0],1] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[NODE2:22125] [[55778,0],1]-[[55778,1],6] usock_peer_accept: usock_peer_send_connect_ack failed
[NODE2:22125] [[55778,0],1] usock_peer_send_blocking: send() to socket 35 failed: Broken pipe (32)
[NODE2:22125] [[55778,0],1] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[NODE2:22125] [[55778,0],1]-[[55778,1],7] usock_peer_accept: usock_peer_send_connect_ack failed
[NODE2:22125] [[55778,0],1] usock_peer_send_blocking: send() to socket 62 failed: Broken pipe (32)
[NODE2:22125] [[55778,0],1] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[NODE2:22125] [[55778,0],1]-[[55778,1],9] usock_peer_accept: usock_peer_send_connect_ack failed
[NODE2:22125] [[55778,0],1] usock_peer_send_blocking: send() to socket 54 failed: Broken pipe (32)
[NODE2:22125] [[55778,0],1] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[NODE2:22125] [[55778,0],1]-[[55778,1],10] usock_peer_accept: usock_peer_send_connect_ack failed
Warning: Conflicting CPU frequencies detected, using: 1900.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
Warning: Conflicting CPU frequencies detected, using: 1900.000000.
[NODE2:22130] PML yalla cannot be selected
Warning: Conflicting CPU frequencies detected, using: 1900.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
    0  /lib/x86_64-linux-gnu/libc.so.6(+0x36150) [0x7fd3a1cb2150]
    1  /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0) [0x7fd39de2e0c0]
    2  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401) 
[0x7fd39751e0a1]
    3  /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7fd39751d8df]
    4  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43) 
[0x7fd39751fb63]
    5  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7fd397522a57]
    6  
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107) 
[0x7fd39c17f577]
    7  
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
 [0x7fd3a17623a5]
    8  /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7fd3a22f6e09]
    9  /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d) 
[0x7fd3a176b22d]
   10  /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7fd3a229ce66]
   11  /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7fd3a22bbfbb]
   12  phyml-mpi() [0x401cf9]
   13  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7fd3a1c9d76d]
   14  phyml-mpi() [0x4028f9]
===================
Warning: Conflicting CPU frequencies detected, using: 1900.000000.
[NODE2:22132] PML yalla cannot be selected
Warning: Conflicting CPU frequencies detected, using: 1900.000000.
[NODE2:22134] PML yalla cannot be selected
Warning: Conflicting CPU frequencies detected, using: 1900.000000.
[NODE2:22137] PML yalla cannot be selected
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
    0  /lib/x86_64-linux-gnu/libc.so.6(+0x36150) [0x7fdf6f257150]
    1  /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0) [0x7fdf673bb0c0]
    2  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401) 
[0x7fdf64bfa0a1]
    3  /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7fdf64bf98df]
    4  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43) 
[0x7fdf64bfbb63]
    5  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7fdf64bfea57]
    6  
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107) 
[0x7fdf6570c577]
    7  
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
 [0x7fdf6ed073a5]
    8  /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7fdf6f89be09]
    9  /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d) 
[0x7fdf6ed1022d]
   10  /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7fdf6f841e66]
   11  /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7fdf6f860fbb]
   12  phyml-mpi() [0x401cf9]
   13  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7fdf6f24276d]
   14  phyml-mpi() [0x4028f9]
===================
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
[NODE2:22144] PML yalla cannot be selected
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
    0  /lib/x86_64-linux-gnu/libc.so.6(+0x36150) [0x7ff41e98a150]
    1  /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0) [0x7ff4169b90c0]
    2  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401) 
[0x7ff4141f80a1]
    3  /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7ff4141f78df]
    4  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43) 
[0x7ff4141f9b63]
    5  /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7ff4141fca57]
    6  
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107) 
[0x7ff414d0a577]
    7  
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
 [0x7ff41e43a3a5]
    8  /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7ff41efcee09]
    9  /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d) 
[0x7ff41e44322d]
   10  /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7ff41ef74e66]
   11  /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7ff41ef93fbb]
   12  phyml-mpi() [0x401cf9]
   13  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7ff41e97576d]
   14  phyml-mpi() [0x4028f9]
===================
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
[NODE2:22150] PML yalla cannot be selected
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
[NODE2:22154] PML yalla cannot be selected
[NODE2:21943] 11 more processes have sent help message help-btl-vader.txt / knem fail open
[NODE2:21943] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[NODE2:21943] 11 more processes have sent help message help-btl-vader.txt / knem requested but not available
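
The messages above complain about two device files: /dev/knem (Errno 2, no such file or directory) and uverbs0. A minimal sketch of a check that could be run on each compute node to see whether those files exist and are accessible to the job's user. The function name check_dev is made up for illustration, and /dev/infiniband/uverbs0 is an assumed location for the uverbs0 device, not something the log states:

```shell
#!/bin/bash
# check_dev (hypothetical helper): print one status line for a device path.
check_dev() {
    local path="$1"
    if [ ! -e "$path" ]; then
        echo "$path: missing"
    elif [ -r "$path" ] && [ -w "$path" ]; then
        echo "$path: ok"
    else
        echo "$path: exists but not accessible"
    fi
}

# /dev/knem is named in the Open MPI warning; the uverbs0 path is an assumption.
for dev in /dev/knem /dev/infiniband/uverbs0; do
    check_dev "$dev"
done
```

Running it from within a qsub job, rather than from a login shell, would show what the job's environment actually sees on NODE2, NODE3 and NODE7.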