Hi Brendan,

I should clarify, as my response may confuse folks.  We had configured the
ConnectX-4 cards to use Ethernet/RoCE rather than IB transport for these
measurements.
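
For reference, here is a rough sketch of the checks suggested in the quoted
message below. It assumes an Open MPI build with MXM support and that
--map-by node spreads the four ranks over both hosts. The hostfile name and
IMB path are copied from the original command. The run helper and the RUN
variable are hypothetical conveniences, not part of Open MPI. By default the
script only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints each command; set RUN=1 to actually execute it.
# HOSTFILE and IMB default to the values used in the original command.
HOSTFILE="${HOSTFILE:-mpi-hosts-ce}"
IMB="${IMB:-/usr/local/bin/IMB-MPI1}"

run() {
    # Echo the command line, then execute it only when RUN=1.
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# 1. Check process placement: both node names should appear in the output.
run mpirun -np 4 --map-by node -hostfile "$HOSTFILE" hostname

# 2. Sendrecv through the MXM MTL (assumes Open MPI was built with MXM).
run mpirun -np 4 --map-by node -hostfile "$HOSTFILE" \
    --mca pml cm --mca mtl mxm -x MXM_TLS=self,shm,rc \
    "$IMB" Sendrecv

# 3. Same benchmark through the tcp btl, as a baseline for comparison.
run mpirun -np 4 --map-by node -hostfile "$HOSTFILE" \
    --mca pml ob1 --mca btl tcp,self,sm \
    "$IMB" Sendrecv
```

If steps 2 and 3 report similar large-message bandwidth, the traffic is
probably not going over RoCE.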

Howard


2016-11-08 16:08 GMT-07:00 Howard Pritchard <hpprit...@gmail.com>:

> Hi Brendan,
>
> What type of Ethernet device (is this a Mellanox HCA?) and Ethernet switch
> are you using?  The mpirun options look correct to me.  Is it possible
> that you have all the MPI processes on a single node?  It should be pretty
> obvious from the SendRecv IMB test if you're using RoCE.  The large-message
> bandwidth will be much better than if you are going through the tcp btl.
>
> If you're using Mellanox cards, you might want to do a sanity check using
> the MXM libraries.  You'd want to set the MXM_TLS environment variable to
> "self,shm,rc".  We got close to 90 Gb/sec bandwidth using ConnectX-4 + MXM
> MTL on a cluster earlier this year.
>
> Howard
>
>
>
> 2016-11-08 15:15 GMT-07:00 Brendan Myers <brendan.my...@soft-forge.com>:
>
>> Hello,
>>
>> I am trying to figure out how I can verify that the Open MPI traffic is
>> actually being transmitted over the RoCE fabric connecting my cluster.  My
>> MPI job runs quickly and error-free, but I cannot seem to verify that a
>> significant amount of data is being transferred to the other endpoint in
>> my RoCE fabric.  When I remove the OOB exclusion from my command and
>> analyze my RoCE interface using the tools listed below, I can see what I
>> believe to be the OOB data.
>>
>> Software:
>>
>> * CentOS 7.2
>> * Open MPI 2.0.1
>>
>> Command:
>>
>>   mpirun --mca btl openib,self,sm --mca oob_tcp_if_exclude eth3 \
>>     --mca btl_openib_receive_queues P,65536,120,64,32 \
>>     --mca btl_openib_cpc_include rdmacm -np 4 -hostfile mpi-hosts-ce \
>>     /usr/local/bin/IMB-MPI1
>>
>> * eth3 is my RoCE interface
>> * The RoCE interfaces of the 2 nodes involved are defined in my
>>   mpi-hosts-ce file
>>
>> Ways I have tried to verify the data transfer:
>>
>> * Through the port counters on my RoCE switch
>>   - Sees data being sent when using ib_write_bw but not when using
>>     Open MPI
>> * Through ibdump
>>   - Sees data being sent when using ib_write_bw but not when using
>>     Open MPI
>> * Through Wireshark
>>   - Sees data being sent when using ib_write_bw but not when using
>>     Open MPI
>>
>>
>>
>> I do not have much experience with Open MPI and apologize if I have left
>> out necessary information.  I will respond with any data requested.  I
>> appreciate the time spent to read and respond to this.
>>
>>
>>
>>
>>
>> Thank you,
>>
>>
>>
>> Brendan T. W. Myers
>>
>> brendan.my...@soft-forge.com
>>
>> Software Forge Inc
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>
>
>
