Xie Bin,

According to the man page, -N is equivalent to --npernode, which is in turn equivalent to --map-by ppr:N:node.

This is *not* equivalent to --map-by node:

The former packs tasks onto the same node, whereas the latter scatters tasks across the nodes in a round-robin fashion, as the two runs below show:


[gilles@login ~]$ mpirun --host n0:2,n1:2 -N 2 --tag-output hostname | sort
[1,0]<stdout>:n0
[1,1]<stdout>:n0
[1,2]<stdout>:n1
[1,3]<stdout>:n1


[gilles@login ~]$ mpirun --host n0:2,n1:2 -np 4 --tag-output -map-by node hostname | sort
[1,0]<stdout>:n0
[1,1]<stdout>:n1
[1,2]<stdout>:n0
[1,3]<stdout>:n1
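
For completeness, the ppr form below should produce the same packed layout as the -N 2 run above (a sketch based on the man page, not re-run here):

[gilles@login ~]$ mpirun --host n0:2,n1:2 --map-by ppr:2:node --tag-output hostname | sort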


I am pretty sure a subnet manager was run at some point in time (so your HCAs could get their identifiers).

/* feel free to reboot your nodes and see if ibstat still shows the adapters as active */
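
For example, something along these lines on each node should tell you whether the ports stay Active after a reboot (a sketch; mlx5_0 is the HCA from your btl_openib_if_include setting):

ibstat mlx5_0 | grep -i state

Without a subnet manager running, the ports would typically come back as Initializing rather than Active.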


Note you might also add --mca pml ob1 in order to make sure that neither mxm nor ucx is used.
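
For instance, taking the two-node command line you posted below, that would be something like (a sketch, with your other options left unchanged):

mpirun --allow-run-as-root --mca pml ob1 -mca btl ^tcp -mca btl_tcp_if_include eth2 -mca btl_openib_if_include mlx5_0 -x OMP_NUM_THREADS=4 -N 16 myapp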


Cheers,


Gilles



On 5/15/2018 10:45 AM, Blade Shieh wrote:
Hi, George:
My command lines are:
1) single node
mpirun --allow-run-as-root -mca btl self,tcp(or openib) -mca btl_tcp_if_include eth2 -mca btl_openib_if_include mlx5_0 -x OMP_NUM_THREADS=2 -n 32 myapp
2) 2-node cluster
mpirun --allow-run-as-root -mca btl ^tcp(or ^openib) -mca btl_tcp_if_include eth2 -mca btl_openib_if_include mlx5_0 -x OMP_NUM_THREADS=4 -N 16 myapp

In the 2nd case, I used -N, which is equal to --map-by node.

Best regards,
Xie Bin


On Tue, May 15, 2018 at 02:07, George Bosilca <bosi...@icl.utk.edu> wrote:

    Shared memory communication is important for multi-core platforms,
    especially when you have multiple processes per node. But this is
    only part of your issue here.

    You haven't specified how your processes will be mapped on your
    resources. As a result, ranks 0 and 1 will be on the same node, so
    you are testing the shared memory support of whatever BTL you
    allow. In this case the performance will be much better for TCP
    than for IB, simply because you are not using your network, but
    its capacity to move data across memory banks. In such an
    environment, TCP translates to a memcpy plus a system call, which
    is much faster than IB. That being said, it should not matter
    because shared memory is there to cover this case.

    Add "--map-by node" to your mpirun command to measure the
    bandwidth between nodes.
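
    For instance (a sketch; osu_latency is the benchmark mentioned
    elsewhere in this thread, and the BTL options are omitted here):

    mpirun -np 2 --map-by node ./osu_latency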

      George.



    On Mon, May 14, 2018 at 5:04 AM, Blade Shieh <bladesh...@gmail.com> wrote:


        Hi, Nathan:
            Thanks for your reply.
        1) It was my mistake not to notice the usage of osu_latency.
        Now it works well, but openib is still slower.
        2) I did not use sm or vader because I wanted to compare the
        performance of tcp and openib. Besides, I will run the
        application on a cluster, so vader is not so important.
        3) Of course, I tried your suggestions. I used ^tcp/^openib and
        set btl_openib_if_include to mlx5_0 in a two-node cluster (IB
        direct-connected). The result did not change -- IB is still
        better in the MPI benchmark but worse in my application.

        Best Regards,
        Xie Bin
