Dear All,
To run a 32-bit program on a 64-bit cluster, one has to build a 32-bit
OpenMPI. Following some instructions on this mailing list, I
successfully built a 32-bit OpenMPI 1.2.4 on a 64-bit OS. However, I run
into an openib problem when I try to run the hello_c program. I also built
a 64-bit OpenMPI from the same source, and interestingly the 64-bit build
works just fine. Below is the output from orterun:
############################################################################
iceland:/home/tlin/test_pbs>/home/tin/openmpi-1.2.4/bin/orterun -np 2
--hostfile mach.lst /home/tlin/test_pbs/hello_c.32
--------------------------------------------------------------------------
The OpenIB BTL failed to initialize while trying to create an internal
queue. This typically indicates a failed OpenFabrics installation or
faulty hardware. The failure occured here:
Host: cl1n004
OMPI source: btl_openib.c:828
Function: ibv_create_cq()
Error: Invalid argument (errno=22)
Device: mthca0
You may need to consult with your system administrator to get this
problem fixed.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
The OpenIB BTL failed to initialize while trying to create an internal
queue. This typically indicates a failed OpenFabrics installation or
faulty hardware. The failure occured here:
Host: cl1n001
OMPI source: btl_openib.c:828
Function: ibv_create_cq()
Error: Invalid argument (errno=22)
Device: mthca0
You may need to consult with your system administrator to get this
problem fixed.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
PML add procs failed
--> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
PML add procs failed
--> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
######################################################################
I have seen this error before on another cluster, where following the
instructions at
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
fixed the problem. However, I doubt that the locked-memory limit is the
reason the 32-bit OpenMPI does not work on this cluster. The output from
limit looks fine to me, and if the limit were the cause, the 64-bit
OpenMPI would not work either. Any ideas?
Thanks,
Teng