Boris,
Open MPI should automatically detect the InfiniBand hardware and use
openib (and *not* tcp) for inter-node communications,
and a shared-memory optimized btl (e.g. sm or vader) for intra-node
communications.
Note that if you pass "-mca btl openib,self", you tell Open MPI to use
the openib btl between all tasks,
including tasks running on the same node (which is less efficient than
using sm or vader).
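As a rough sketch of what that means for your command line (the paths, hostfile and binary name are copied from your post; the function names are only for illustration, and I dropped the redundant -host list on the assumption that hostfile5 already names the five nodes):

```shell
# Paths and names come from the original post; adjust to your install.
MPIEXEC=${MPIEXEC:-/usr/local/open-mpi/1.10.7/bin/mpiexec}

# Preferred: no "-mca btl" option at all, so Open MPI picks openib
# between nodes and a shared memory btl (sm or vader) inside a node.
run_auto() {
    "$MPIEXEC" --hostfile hostfile5 -n 200 DoWork
}

# Explicit alternative: if you do list btls, keep a shared memory one
# (sm here, vader would also do) next to openib and self.
run_explicit() {
    "$MPIEXEC" --mca btl openib,sm,self --hostfile hostfile5 -n 200 DoWork
}
```

Calling run_auto is the first thing to try; run_explicit is only there to show what a complete btl list looks like.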
First, I suggest you make sure InfiniBand is up and running on all
your nodes
(just run ibstat: at least one port should be listed, its state should be
Active, and all nodes should report the same SM lid).
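One possible way to run that check on every node at once is sketched below. It assumes passwordless ssh, that ibstat is in the PATH on the remote side, and the node names from your post; the two function names are made up for this example.

```shell
# Keep only the ibstat fields worth comparing across nodes.
# "Physical state:" is deliberately not matched.
ib_summary() {
    grep -E '^[[:space:]]*(State|SM lid):'
}

# Print the port state and SM lid of each node.
# Assumes passwordless ssh and the node names from the original post.
check_nodes() {
    for n in node01 node02 node03 node04 node05; do
        echo "== $n =="
        ssh "$n" ibstat | ib_summary
    done
}
```

Run check_nodes from the head node: every node should show State: Active and the same SM lid value.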
Then try to run two tasks on two nodes.
If this does not work, you can run
mpirun --mca btl_base_verbose 100 ...
and post the logs so we can investigate from there.
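For instance, the two-task test and the verbose run could look like the sketch below. The install path and node names are taken from your post, -npernode 1 is my addition to force one task per node, and btl_verbose.log is just a placeholder file name.

```shell
# Path taken from the original post (mpirun is installed next to mpiexec).
MPIRUN=${MPIRUN:-/usr/local/open-mpi/1.10.7/bin/mpirun}

# Two tasks, one per node; extra mpirun options can be passed as arguments.
two_node_test() {
    "$MPIRUN" "$@" -npernode 1 -host node01,node02 -n 2 DoWork
}

# Same run with verbose btl selection; stdout and stderr are saved
# to a file (hypothetical name) that can be posted to the list.
two_node_debug() {
    two_node_test --mca btl_base_verbose 100 2>&1 | tee btl_verbose.log
}
```

Start with two_node_test, and only fall back to two_node_debug if it fails.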
Cheers,
Gilles
On 7/14/2017 6:43 AM, Boris M. Vulovic wrote:
I would like to know how to invoke InfiniBand hardware on a CentOS 6.x
cluster with OpenMPI (static libs.) for running my C++ code. This is
how I compile and run:
/usr/local/open-mpi/1.10.7/bin/mpic++ -L/usr/local/open-mpi/1.10.7/lib
-Bstatic main.cpp -o DoWork
/usr/local/open-mpi/1.10.7/bin/mpiexec -mca btl tcp,self --hostfile
hostfile5 -host node01,node02,node03,node04,node05 -n 200 DoWork
Here, "*-mca btl tcp,self*" reveals that *TCP* is used, and the
cluster has InfiniBand.
What should be changed in the compile and run commands for
InfiniBand to be invoked? If I just replace "*-mca btl tcp,self*" with
"*-mca btl openib,self*" then I get plenty of errors, with the relevant one
saying:
/At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is an
error; Open MPI requires that all MPI processes be able to reach each
other. This error can sometimes be the result of forgetting to specify
the "self" BTL./
Thanks very much!!!
*Boris *
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users