Shared memory communication is important for multi-core platforms, especially when you have multiple processes per node. But this is only part of your issue here.
You haven't specified how your processes will be mapped onto your resources. As a result, ranks 0 and 1 will be on the same node, so you are testing the shared memory support of whatever BTL you allow. In this case the performance will be much better for TCP than for IB, simply because you are not exercising your network at all, only the node's capacity to move data across memory banks. In such an environment TCP translates to a memcpy plus a system call, which is much faster than IB. That being said, it should not matter, because shared memory is there to cover exactly this case. Add "--map-by node" to your mpirun command to measure the bandwidth between nodes; a sample invocation follows after the quoted message.

George.

On Mon, May 14, 2018 at 5:04 AM, Blade Shieh <bladesh...@gmail.com> wrote:
> Hi, Nathan:
> Thanks for your reply.
> 1) It was my mistake not to notice the usage of osu_latency. Now it works well, but is still slower with openib.
> 2) I did not use sm or vader because I wanted to compare the performance of tcp and openib. Besides, I will run the application on a cluster, so vader is not so important.
> 3) Of course, I tried your suggestions. I used ^tcp/^openib and set btl_openib_if_include to mlx5_0 in a two-node cluster (IB direct-connected). The result did not change -- IB is still better in the MPI benchmark but worse in my application.
>
> Best Regards,
> Xie Bin
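[A minimal sketch of the suggested runs, assuming an OSU micro-benchmark binary named osu_bw and a two-node allocation; the binary name and process count are placeholders, adjust them to your setup:

    # one rank per node, so the benchmark actually crosses the interconnect (IB case)
    mpirun --map-by node -np 2 --mca btl self,openib ./osu_bw
    # same mapping, restricted to the TCP BTL for comparison
    mpirun --map-by node -np 2 --mca btl self,tcp ./osu_bw

With this mapping, ranks 0 and 1 land on different nodes, so the measurement reflects the network rather than intra-node memory copies.]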