It might be worth trying --mca btl_openib_cpc_include udcm to see if that
works.
-Nathan
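A minimal sketch of how the suggestion above might be tried. It only assembles and prints the command line so it can be inspected before running. The process count and script name are taken from the command quoted in this thread; combining udcm with "--mca pml ob1" and "--mca btl ^usnic" is an assumption, not something anyone in the thread confirmed.

```shell
#!/bin/sh
# Sketch only: build the command line and print it, do not execute it.
# NP and PROG mirror the values quoted in the thread; adjust as needed.
NP=5
PROG=myscript.sh

# btl_openib_cpc_include restricts the openib connection manager (here: udcm).
# "--mca pml ob1" and "--mca btl ^usnic" come from Gilles's advice in the
# thread; using them together with udcm is an assumption.
CMD="mpirun --mca pml ob1 --mca btl ^usnic --mca btl_openib_cpc_include udcm -np $NP $PROG"

# Print instead of running, so the line can be checked first.
echo "$CMD"
```

If the printed line looks right, it can be pasted into the shell or run with `sh -c "$CMD"`.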
On Aug 23, 2016, at 02:41 AM, "Juan A. Cordero Varelaq"
wrote:
Hi Gilles,
If I run it like this:
mpirun --mca btl ^openib,usnic --mca pml ob1 --mca btl_sm_use_knem 0 -np 5 myscript.sh
it works fine. Am I using infiniband in this way? However, if I remove
*openib*, I get the *librdmacm: Fatal: unable to open RDMA device*
error. So what would be the most
Juan,
if you want to use infiniband with the openib/btl (i am assuming MXM is
not available on your platform, and you do not want
to use infiniband via usnic/libfabric), you can
mpirun --mca pml ob1 --mca btl ^usnic ...
/* i am pretty sure mpirun ... would do the trick too */
if you get th
Hi Gilles,
so if I use the option --mca pml ob1, I use infiniband and it will be
as fast as normal, right?
Thanks
On 22/08/16 14:22, Gilles Gouaillardet wrote:
Juan,
to keep things simple, --mca pml ob1 ensures you are not using mxm
(yet another way to use infiniband)
IPoIB is unlikely to be working on your system now, so for inter-node
communications, you will use tcp with the interconnect you have (GbE or
10 GbE if you are lucky)
in terms of performance, Gb
Hi Gilles,
adding *,usnic* made it work :) --mca pml ob1 would then not be needed.
Does it render mpi very slow if infiniband is disabled (what does --mca
pml ob1 do?)?
Regarding the version mismatch, everything seems to be right. When only
one version is loaded, I see the PATH and the LD_LIBRA
Juan,
can you try to
mpirun --mca btl ^openib,usnic --mca pml ob1 ...
note this simply disables native infiniband. from a performance point of
view, you should have your sysadmin fix the infiniband fabric.
about the version mismatch, please double check your environment
(e.g. $PATH and $LD_LIBRAR
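A minimal sketch of the environment check suggested above, assuming a POSIX shell. It prints each search-path entry on its own line so that two Open MPI installations are easy to spot; `list_entries` is a hypothetical helper name, not part of Open MPI.

```shell
#!/bin/sh
# Hypothetical helper: print a colon-separated list one entry per line.
list_entries() {
    printf '%s\n' "$1" | tr ':' '\n'
}

echo "PATH entries:"
list_entries "$PATH"

echo "LD_LIBRARY_PATH entries:"
# LD_LIBRARY_PATH may be unset; fall back to an empty string.
list_entries "${LD_LIBRARY_PATH:-}"

# Show which mpirun would actually be picked up, if any.
command -v mpirun || echo "mpirun not found in PATH"
```

Running this once per loaded version makes it easier to see whether entries from both installations end up in the same environment.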
Dear Ralph,
The existence of the two versions does not seem to be the source of
problems, since they are in different locations. I uninstalled the most
recent version and tried again with no luck, getting the same
warnings/errors. However, after a thorough search I found a couple of hints,
and exec
The rdma error sounds like something isn’t right with your machine’s Infiniband
installation.
The cross-version problem sounds like you installed both OMPI versions into the
same location - did you do that? If so, then that might be the root cause of
both problems. You need to install them in