Hi all,
I want to build Open MPI 1.2.5 on my InfiniBand cluster, which has OFED-2.1
installed.
I configured Open MPI as follows:
----------------------------------------------------------------------------
./configure --prefix=/exports/home/local/Cluster-Apps/openmpi/gcc/64/1.2.5 \
--enable-shared --enable-static --enable-debug \
--with-openib=/usr/local/Cluster-Apps/infinipath/2.1/ofed
----------------------------------------------------------------------------
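As a sanity check, I think something like the following should show whether
configure really picked up the OpenFabrics support and whether the openib
plugins got installed (the grep pattern and the plugin directory layout are
my assumptions, not exact output):

# look for OpenFabrics/openib detection in the configure log
grep -i openib config.log | head

# list the installed MCA plugins and look for the openib ones
ls /exports/home/local/Cluster-Apps/openmpi/gcc/64/1.2.5/lib/openmpi/ | grep openib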
'ompi_info | grep openib' shows only:
MCA btl: openib (MCA v1.0, API v1.0.1, Component v1.2.5)
but not:
MCA mpool: openib (MCA v1.0, API v1.0, Component v1.0)
I don't know why, or whether this will cause a failure.
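For reference, I can also dump the parameters the openib BTL registers
(assuming the usual 'ompi_info --param <framework> <component>' form) and
list every BTL/mpool component that ompi_info reports:

# parameters registered by the openib BTL component
ompi_info --param btl openib

# all BTL and mpool components that ompi_info knows about
ompi_info | grep -i -E "MCA (btl|mpool)"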
When I tried to run an MPI code with the option "--mca btl openib,self", it
failed with the following messages:
mpirun --mca btl openib,self -np 4 ./hello
--------------------------------------------------------------------------
Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
PML add procs failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--------------------------------------------------------------------------
[The same MPI_INIT failure and MPI_ERRORS_ARE_FATAL output is printed by
each of the other failing processes; I've snipped the repeated copies.]
--------------------------------------------------------------------------
Process 0.1.3 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
Process 0.1.2 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
[node128:16605] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[node128:16605] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1166
[node128:16605] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
[node128:16605] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[node128:16605] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1198
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons for this job. Returned
value Timeout instead of ORTE_SUCCESS.
--------------------------------------------------------------------------
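If it helps narrow this down, I can rerun with the BTL selection made more
verbose, and also force TCP to see whether only the InfiniBand path is
affected. A sketch of what I have in mind (the verbosity parameter name is
my best guess at the usual btl_base convention):

# force TCP so I can see whether the problem is specific to openib
mpirun --mca btl tcp,self -np 4 ./hello

# rerun over openib with verbose output from the BTL selection logic
mpirun --mca btl openib,self --mca btl_base_verbose 30 -np 4 ./hello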
Does anyone know where I went wrong? Thanks.
--Yuan
Yuan Wan
--
Unix Section
Information Services Infrastructure Division
University of Edinburgh
tel: 0131 650 4985
email: y...@ed.ac.uk
2012 Computing Services, JCMB
The King's Buildings,
Edinburgh, EH9 3JZ