Hi Nathan, Howard,
Thanks for the feedback. Yes, we do already have UCX compiled into our OpenMPI
installations, but it’s disabled by default on our system because some users
were reporting problems with it previously. But I’m not sure what the status of
those problems is with OpenMPI 3.0, something f
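In the meantime, if anyone wants to test it on our system, UCX can be re-enabled
per job rather than system-wide, something like this (assuming the ucx PML was
built in; the benchmark binary here is just an example):

  mpirun --mca pml ucx -np 2 ./osu_bw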
Honestly, this is a configuration issue with the openib btl. There is no reason to use
eager RDMA, nor is there a reason to pipeline RDMA. I haven't found an app where
either of these "features" helps you with InfiniBand. You have the right idea
with the parameter changes, but Howard is
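If you want both off everywhere, the cleanest place is probably your installation's
etc/openmpi-mca-params.conf. A sketch, with parameter names as in the 3.0.x openib
btl (there is no single off switch for the pipeline, so just set its threshold
far above any message size you care about):

  btl_openib_use_eager_rdma = 0
  btl_openib_min_rdma_pipeline_size = 1073741824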
Hello Ben,
Thanks for the info. You would probably be better off installing UCX on
your cluster and rebuilding your Open MPI with the
--with-ucx
configure option.
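For example (the install prefix and UCX location are placeholders for your site):

  ./configure --prefix=/opt/openmpi-3.0.1 --with-ucx=/path/to/ucx
  make -j8 all install

You can then select the ucx PML at run time with --mca pml ucx.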
Here's what I'm seeing with Open MPI 3.0.1 on a ConnectX-5 based cluster
using the ob1/openib BTL:
mpirun -map-by ppr:1:node -np 2 ./osu
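To be sure a run actually goes through ob1/openib rather than anything else, the
components can be pinned explicitly; a sketch, with the benchmark name as a
placeholder and component names as in 3.0.x:

  mpirun --mca pml ob1 --mca btl openib,self,vader -map-by ppr:1:node -np 2 ./osu_bw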
Hi,
Another interesting point. I noticed that the bandwidth for the last two message
sizes tested (2 MB and 4 MB) is lower than expected for both osu_bw and osu_bibw.
Increasing the minimum message size at which the RDMA pipeline kicks in to above
these sizes brings those two data points up to scratch for both benchmarks:
3.0.0, o
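For reference, the knob I raised, assuming I have the 3.0.0 parameter name right
(8 MB, i.e. above the largest message size the OSU benchmarks test by default):

  mpirun --mca btl_openib_min_rdma_pipeline_size 8388608 -np 2 ./osu_bibw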
Hi,
We’ve just been running some OSU benchmarks with OpenMPI 3.0.0 and noticed that
osu_bibw gives nowhere near the bandwidth I’d expect (this is on FDR IB).
However, osu_bw is fine.
If I disable eager RDMA, then osu_bibw gives the expected numbers. Similarly,
if I increase the number of eager
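For concreteness, "disable eager RDMA" above means something like this (parameter
name as in the 3.0.x openib BTL; the rest of the command line is abbreviated):

  mpirun --mca btl_openib_use_eager_rdma 0 -np 2 ./osu_bibw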