Hi,
I am seriously struggling to run Open MPI on my SGE HPC cluster under
Ubuntu 12.04:
* Every time I run an Open MPI command, say ompi_info, the first line of
the output is:
    Warning: Conflicting CPU frequencies detected, using: 2001.000000.
* I get the attached error and output when I submit the following script
with qsub (making use of queues):
    #!/bin/bash
    #$ -cwd
    #$ -S /bin/bash
    #$ -V
    #$ -q normal
    #$ -pe mpi 40
    #$ -P Lab219
    #$ -o output
    #$ -e error
    module load PhyML/3.3
    mpirun --mca pml yalla --mca orte_base_help_aggregate 0 -np 40 \
        phyml-mpi -i proteic -b 10 -d aa
It seems /dev/knem is set up so that only the "RDMA" user group can
access it, even though I changed the 10-knem.rules file to use mode
"0666" so that every user can use it.
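To double-check what the mode in the attached output means, here is a small sketch that decodes the "/dev/knem permissions: 020660" value reported for NODE4. The helper name decode_mode is made up for illustration and is not part of Open MPI or knem; it assumes the value is an octal st_mode written with a leading 0.

```shell
#!/bin/bash
# Decode an octal st_mode value such as the "020660" printed by Open MPI.
# decode_mode is a hypothetical helper, named here only for illustration.
decode_mode() {
    mode=$(( $1 ))                 # the leading 0 makes the shell read it as octal
    ftype=$(( mode & 0170000 ))    # file-type bits
    perm=$(( mode & 07777 ))       # permission bits
    kind="other"
    if [ "$ftype" -eq $(( 0020000 )) ]; then
        kind="character device"
    fi
    printf '%s, permissions %04o\n' "$kind" "$perm"
}

decode_mode 020660   # prints "character device, permissions 0660"
```

If this reading is right, the device on NODE4 is still 0660 (read/write for owner and group only), so the "0666" rule may not have taken effect on that node. Running `stat -c '%a %U:%G' /dev/knem` on each compute node should show the mode and group actually in place.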
I can't understand why the "Warning: Conflicting CPU frequencies
detected, using: 2001.000000.", the "MXM: Got signal 11 (Segmentation
fault)" and the "PML yalla cannot be selected" errors are happening,
and I can't figure out how to tackle them.
I would greatly appreciate any hint about these issues.
Thanks a lot in advance.
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
0 /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7f4d1b2ae0b0]
1 /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0) [0x7f4d137dc0c0]
2 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401)
[0x7f4d10bfa0a1]
3 /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7f4d10bf98df]
4 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43)
[0x7f4d10bfbb63]
5 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7f4d10bfea57]
6
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107)
[0x7f4d11889577]
7
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
[0x7f4d1ad5e3a5]
8 /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7f4d1b8f1e09]
9 /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d)
[0x7f4d1ad6722d]
10 /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7f4d1b897e66]
11 /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7f4d1b8b6fbb]
12 phyml-mpi() [0x401cf9]
13 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f4d1b2997ed]
14 phyml-mpi() [0x4028f9]
===================
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
[NODE4:23545] PML yalla cannot be selected
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
[NODE4:23544] PML yalla cannot be selected
[NODE7:06478] [[7125,0],1] usock_peer_send_blocking: send() to socket 85
failed: Broken pipe (32)
[NODE7:06478] [[7125,0],1] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE7:06478] [[7125,0],1]-[[7125,1],7] usock_peer_accept:
usock_peer_send_connect_ack failed
[NODE7:06478] [[7125,0],1] usock_peer_send_blocking: send() to socket 55
failed: Broken pipe (32)
[NODE7:06478] [[7125,0],1] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE7:06478] [[7125,0],1]-[[7125,1],0] usock_peer_accept:
usock_peer_send_connect_ack failed
[NODE7:06478] [[7125,0],1] usock_peer_send_blocking: send() to socket 61
failed: Broken pipe (32)
[NODE7:06478] [[7125,0],1] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE7:06478] [[7125,0],1]-[[7125,1],1] usock_peer_accept:
usock_peer_send_connect_ack failed
[NODE7:06478] [[7125,0],1] usock_peer_send_blocking: send() to socket 66
failed: Broken pipe (32)
[NODE7:06478] [[7125,0],1] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE7:06478] [[7125,0],1]-[[7125,1],2] usock_peer_accept:
usock_peer_send_connect_ack failed
[NODE7:06478] [[7125,0],1] usock_peer_send_blocking: send() to socket 68
failed: Broken pipe (32)
[NODE7:06478] [[7125,0],1] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE7:06478] [[7125,0],1]-[[7125,1],3] usock_peer_accept:
usock_peer_send_connect_ack failed
[NODE7:06478] [[7125,0],1] usock_peer_send_blocking: send() to socket 74
failed: Broken pipe (32)
[NODE7:06478] [[7125,0],1] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE7:06478] [[7125,0],1]-[[7125,1],4] usock_peer_accept:
usock_peer_send_connect_ack failed
[NODE7:06478] [[7125,0],1] usock_peer_send_blocking: send() to socket 77
failed: Broken pipe (32)
[NODE7:06478] [[7125,0],1] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE7:06478] [[7125,0],1]-[[7125,1],5] usock_peer_accept:
usock_peer_send_connect_ack failed
[NODE7:06478] [[7125,0],1] usock_peer_send_blocking: send() to socket 87
failed: Broken pipe (32)
[NODE7:06478] [[7125,0],1] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE7:06478] [[7125,0],1]-[[7125,1],6] usock_peer_accept:
usock_peer_send_connect_ack failed
Warning: Conflicting CPU frequencies detected, using: 2201.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
0 /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7f1b7b3740b0]
1 /usr/lib/libibverbs.so.1(ibv_close_device+0x13) [0x7f1b731c59c3]
2 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x40d)
[0x7f1b70bfe0ad]
3 /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7f1b70bfd8df]
4 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43)
[0x7f1b70bffb63]
5 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7f1b70c02a57]
6
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107)
[0x7f1b70e41577]
7
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
[0x7f1b7ae243a5]
8 /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7f1b7b9b7e09]
9 /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d)
[0x7f1b7ae2d22d]
10 /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7f1b7b95de66]
11 /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7f1b7b97cfbb]
12 phyml-mpi() [0x401cf9]
13 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f1b7b35f7ed]
14 phyml-mpi() [0x4028f9]
===================
Warning: Conflicting CPU frequencies detected, using: 2201.000000.
[NODE7:06483] PML yalla cannot be selected
Warning: Conflicting CPU frequencies detected, using: 2201.000000.
Warning: Conflicting CPU frequencies detected, using: 2201.000000.
[NODE3:45371] [[7125,0],2] usock_peer_send_blocking: send() to socket 52
failed: Broken pipe (32)
[NODE3:45371] [[7125,0],2] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE3:45371] [[7125,0],2]-[[7125,1],16] usock_peer_accept:
usock_peer_send_connect_ack failed
[NODE3:45371] [[7125,0],2] usock_peer_send_blocking: send() to socket 61
failed: Broken pipe (32)
[NODE3:45371] [[7125,0],2] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE3:45371] [[7125,0],2]-[[7125,1],18] usock_peer_accept:
usock_peer_send_connect_ack failed
[NODE3:45371] [[7125,0],2] usock_peer_send_blocking: send() to socket 67
failed: Broken pipe (32)
[NODE3:45371] [[7125,0],2] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE3:45371] [[7125,0],2]-[[7125,1],19] usock_peer_accept:
usock_peer_send_connect_ack failed
[NODE3:45371] [[7125,0],2] usock_peer_send_blocking: send() to socket 83
failed: Broken pipe (32)
[NODE3:45371] [[7125,0],2] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE3:45371] [[7125,0],2]-[[7125,1],21] usock_peer_accept:
usock_peer_send_connect_ack failed
[NODE3:45371] [[7125,0],2] usock_peer_send_blocking: send() to socket 84
failed: Broken pipe (32)
[NODE3:45371] [[7125,0],2] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE3:45371] [[7125,0],2]-[[7125,1],23] usock_peer_accept:
usock_peer_send_connect_ack failed
[NODE3:45371] [[7125,0],2] usock_peer_send_blocking: send() to socket 90
failed: Broken pipe (32)
[NODE3:45371] [[7125,0],2] ORTE_ERROR_LOG: Unreachable in file
oob_usock_connection.c at line 315
[NODE3:45371] [[7125,0],2]-[[7125,1],26] usock_peer_accept:
usock_peer_send_connect_ack failed
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
[NODE3:45375] PML yalla cannot be selected
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
0 /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7f86c1a0e0b0]
1 /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0x3) [0x7f86bdfad0c3]
2 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401)
[0x7f86b73150a1]
3 /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7f86b73148df]
4 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43)
[0x7f86b7316b63]
5 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7f86b7319a57]
6
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107)
[0x7f86bc0e6577]
7
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
[0x7f86c14be3a5]
8 /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7f86c2051e09]
9 /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d)
[0x7f86c14c722d]
10 /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7f86c1ff7e66]
11 /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7f86c2016fbb]
12 phyml-mpi() [0x401cf9]
13 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f86c19f97ed]
14 phyml-mpi() [0x4028f9]
===================
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
0 /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7f0f5ea8a0b0]
1 /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0x3) [0x7f0f56fce0c3]
2 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401)
[0x7f0f543ec0a1]
3 /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7f0f543eb8df]
4 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43)
[0x7f0f543edb63]
5 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7f0f543f0a57]
6
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107)
[0x7f0f55107577]
7
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
[0x7f0f5e53a3a5]
8 /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7f0f5f0cde09]
9 /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d)
[0x7f0f5e54322d]
10 /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7f0f5f073e66]
11 /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7f0f5f092fbb]
12 phyml-mpi() [0x401cf9]
13 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f0f5ea757ed]
14 phyml-mpi() [0x4028f9]
===================
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
0 /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7fef1d1fe0b0]
1 /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0x3) [0x7fef1979d0c3]
2 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401)
[0x7fef12ace0a1]
3 /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7fef12acd8df]
4 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43)
[0x7fef12acfb63]
5 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7fef12ad2a57]
6
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107)
[0x7fef137e9577]
7
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
[0x7fef1ccae3a5]
8 /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7fef1d841e09]
9 /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d)
[0x7fef1ccb722d]
10 /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7fef1d7e7e66]
11 /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7fef1d806fbb]
12 phyml-mpi() [0x401cf9]
13 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7fef1d1e97ed]
14 phyml-mpi() [0x4028f9]
===================
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
MXM: Got signal 11 (Segmentation fault)
==== backtrace ====
Warning: Conflicting CPU frequencies detected, using: 2001.000000.
0 /lib/x86_64-linux-gnu/libc.so.6(+0x360b0) [0x7f7c6cb4c0b0]
1 /usr/lib/libibverbs.so.1(ibv_dealloc_pd+0x3) [0x7f7c690eb0c3]
2 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_ib_init_devices+0x401)
[0x7f7c624c30a1]
3 /opt/mellanox/mxm/lib/libmxm.so.2(+0x158df) [0x7f7c624c28df]
4 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_components_init+0x43)
[0x7f7c624c4b63]
5 /opt/mellanox/mxm/lib/libmxm.so.2(mxm_init+0x137) [0x7f7c624c7a57]
6
/opt/openmpi-2.0.1/lib/openmpi/mca_pml_yalla.so(mca_pml_yalla_open+0x107)
[0x7f7c631de577]
7
/opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_components_open+0xc5)
[0x7f7c6c5fc3a5]
8 /opt/openmpi-2.0.1/lib/libmpi.so.20(+0x9ee09) [0x7f7c6d18fe09]
9 /opt/openmpi-2.0.1/lib/libopen-pal.so.20(mca_base_framework_open+0x9d)
[0x7f7c6c60522d]
10 /opt/openmpi-2.0.1/lib/libmpi.so.20(ompi_mpi_init+0x4b6) [0x7f7c6d135e66]
11 /opt/openmpi-2.0.1/lib/libmpi.so.20(MPI_Init+0x8b) [0x7f7c6d154fbb]
12 phyml-mpi() [0x401cf9]
13 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f7c6cb377ed]
14 phyml-mpi() [0x4028f9]
===================
--------------------------------------------------------------------------
WARING: Open MPI failed to open the /dev/knem device due to a
permissions problem. Please check with your system administrator to
get the permissions fixed, or set the btl_vader_single_copy_mechanism
MCA variable to none to silence this warning and run without knem
support.
Local host: NODE4
/dev/knem permissions: 020660
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: Linux kernel knem support was requested via the
btl_vader_single_copy_mechanism MCA parameter, but Knem support was either not
compiled into this Open MPI installation, or Knem support was unable
to be activated in this process.
The vader BTL will fall back on another single-copy mechanism if one
is available. This may result in lower performance.
Local host: NODE4
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Open MPI failed to open the /dev/knem device due to a permissions
problem. Please check with your system administrator to get the
permissions fixed, or set the btl_sm_use_knem MCA parameter to 0 to
run without /dev/knem support.
Local host: NODE4
/dev/knem permissions: 020660
--------------------------------------------------------------------------
[1485442405.278010] [NODE4:23543:0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
--------------------------------------------------------------------------
WARING: Open MPI failed to open the /dev/knem device due to a
permissions problem. Please check with your system administrator to
get the permissions fixed, or set the btl_vader_single_copy_mechanism
MCA variable to none to silence this warning and run without knem
support.
Local host: NODE4
/dev/knem permissions: 020660
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: Linux kernel knem support was requested via the
btl_vader_single_copy_mechanism MCA parameter, but Knem support was either not
compiled into this Open MPI installation, or Knem support was unable
to be activated in this process.
The vader BTL will fall back on another single-copy mechanism if one
is available. This may result in lower performance.
Local host: NODE4
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Open MPI failed to open the /dev/knem device due to a permissions
problem. Please check with your system administrator to get the
permissions fixed, or set the btl_sm_use_knem MCA parameter to 0 to
run without /dev/knem support.
Local host: NODE4
/dev/knem permissions: 020660
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARING: Open MPI failed to open the /dev/knem device due to a
permissions problem. Please check with your system administrator to
get the permissions fixed, or set the btl_vader_single_copy_mechanism
MCA variable to none to silence this warning and run without knem
support.
Local host: NODE4
/dev/knem permissions: 020660
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: Linux kernel knem support was requested via the
btl_vader_single_copy_mechanism MCA parameter, but Knem support was either not
compiled into this Open MPI installation, or Knem support was unable
to be activated in this process.
The vader BTL will fall back on another single-copy mechanism if one
is available. This may result in lower performance.
Local host: NODE4
--------------------------------------------------------------------------
[1485442405.280364] [NODE4:23545:0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
--------------------------------------------------------------------------
Open MPI failed to open the /dev/knem device due to a permissions
problem. Please check with your system administrator to get the
permissions fixed, or set the btl_sm_use_knem MCA parameter to 0 to
run without /dev/knem support.
Local host: NODE4
/dev/knem permissions: 020660
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARING: Open MPI failed to open the /dev/knem device due to a
permissions problem. Please check with your system administrator to
get the permissions fixed, or set the btl_vader_single_copy_mechanism
MCA variable to none to silence this warning and run without knem
support.
Local host: NODE4
/dev/knem permissions: 020660
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: Linux kernel knem support was requested via the
btl_vader_single_copy_mechanism MCA parameter, but Knem support was either not
compiled into this Open MPI installation, or Knem support was unable
to be activated in this process.
The vader BTL will fall back on another single-copy mechanism if one
is available. This may result in lower performance.
Local host: NODE4
--------------------------------------------------------------------------
[1485442405.281661] [NODE4:23544:0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
--------------------------------------------------------------------------
WARING: Open MPI failed to open the /dev/knem device due to a
permissions problem. Please check with your system administrator to
get the permissions fixed, or set the btl_vader_single_copy_mechanism
MCA variable to none to silence this warning and run without knem
support.
Local host: NODE4
/dev/knem permissions: 020660
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: Linux kernel knem support was requested via the
btl_vader_single_copy_mechanism MCA parameter, but Knem support was either not
compiled into this Open MPI installation, or Knem support was unable
to be activated in this process.
The vader BTL will fall back on another single-copy mechanism if one
is available. This may result in lower performance.
Local host: NODE4
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Open MPI failed to open the /dev/knem device due to a permissions
problem. Please check with your system administrator to get the
permissions fixed, or set the btl_sm_use_knem MCA parameter to 0 to
run without /dev/knem support.
Local host: NODE4
/dev/knem permissions: 020660
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Open MPI failed to open the /dev/knem device due to a permissions
problem. Please check with your system administrator to get the
permissions fixed, or set the btl_sm_use_knem MCA parameter to 0 to
run without /dev/knem support.
Local host: NODE4
/dev/knem permissions: 020660
--------------------------------------------------------------------------
--------------------------------------------------------------------------
No components were able to be opened in the pml framework.
This typically means that either no components of this type were
installed, or none of the installed componnets can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.
Host: NODE4
Framework: pml
--------------------------------------------------------------------------
[1485442405.283457] [NODE4:23547:0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
[1485442405.283678] [NODE4:23546:0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
--------------------------------------------------------------------------
WARING: Open MPI failed to open the /dev/knem device due to a
permissions problem. Please check with your system administrator to
get the permissions fixed, or set the btl_vader_single_copy_mechanism
MCA variable to none to silence this warning and run without knem
support.
Local host: NODE4
/dev/knem permissions: 020660
--------------------------------------------------------------------------
--------------------------------------------------------------------------
No components were able to be opened in the pml framework.
This typically means that either no components of this type were
installed, or none of the installed componnets can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.
Host: NODE4
Framework: pml
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: Linux kernel knem support was requested via the
btl_vader_single_copy_mechanism MCA parameter, but Knem support was either not
compiled into this Open MPI installation, or Knem support was unable
to be activated in this process.
The vader BTL will fall back on another single-copy mechanism if one
is available. This may result in lower performance.
Local host: NODE4
--------------------------------------------------------------------------
[1485443326.409841] [NODE7:6482 :0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
[1485443326.410185] [NODE7:6483 :0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
[1485443326.411360] [NODE7:6484 :0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
[1485443326.411390] [NODE7:6484 :0] shm.c:69 MXM WARN Unable to
close the KNEM device file
[1485443326.411085] [NODE7:6485 :0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
[1485442357.459739] [NODE3:45375:0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
[1485442357.461696] [NODE3:45377:0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
[1485442357.464724] [NODE3:45378:0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
[1485442357.465128] [NODE3:45379:0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0
[1485442357.466958] [NODE3:45384:0] ib_dev.c:159 MXM ERROR Failed to
open uverbs0