[OMPI users] How to verify RDMA traffic (RoCE) is being sent over a fabric when running OpenMPI

2016-11-08 Thread Brendan Myers
Hello,

I am trying to figure out how I can verify that Open MPI traffic is
actually being transmitted over the RoCE fabric connecting my cluster.  My
MPI job runs quickly and error-free, but I cannot verify that any
significant amount of data is being transferred to the other endpoint in my
RoCE fabric.  When I remove the OOB exclusion from my command and analyze my
RoCE interface with the tools listed below, I can see what I believe to be
the OOB (out-of-band) traffic.

Software:

* CentOS 7.2

* Open MPI 2.0.1

Command:

* mpirun --mca btl openib,self,sm --mca oob_tcp_if_exclude eth3
--mca btl_openib_receive_queues P,65536,120,64,32 --mca
btl_openib_cpc_include rdmacm -np 4 -hostfile mpi-hosts-ce
/usr/local/bin/IMB-MPI1

o   eth3 is my RoCE interface

o   The RoCE interface addresses of the two nodes involved are listed in my mpi-hosts-ce file
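
One check I have not tried yet is turning up the BTL verbosity to confirm
which BTL is actually selected at run time.  A rough sketch of what I have
in mind (the same command as above, with verbosity added and the output
filtered):

mpirun --mca btl openib,self,sm --mca btl_base_verbose 100 \
       --mca oob_tcp_if_exclude eth3 \
       --mca btl_openib_receive_queues P,65536,120,64,32 \
       --mca btl_openib_cpc_include rdmacm \
       -np 4 -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1 2>&1 | grep -i openib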

Ways I have tried to verify that data is actually being transferred (see also the counter-check sketch after this list):

* The port counters on my RoCE switch

o   Show data being sent when using ib_write_bw, but not when running Open MPI

* ibdump

o   Shows data being sent when using ib_write_bw, but not when running Open MPI

* Wireshark

o   Shows data being sent when using ib_write_bw, but not when running Open MPI
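
Another check I have in mind (a sketch only; I am assuming the HCA shows up
as mlx4_0 port 1, and whatever ibv_devinfo reports on my nodes would go
there instead) is to read the HCA's own port counters before and after a run:

# transmit counter on the RoCE device before the run
cat /sys/class/infiniband/mlx4_0/ports/1/counters/port_xmit_data
# run the same mpirun command as above, then read the counter again;
# a large increase would indicate that RDMA traffic actually left this port
cat /sys/class/infiniband/mlx4_0/ports/1/counters/port_xmit_data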

 

I do not have much experience with Open MPI, so I apologize if I have left
out any necessary information.  I will respond with any data requested, and
I appreciate the time you have spent reading and responding to this.

 

 

Thank you,

 

Brendan T. W. Myers

brendan.my...@soft-forge.com  

Software Forge Inc

 


Re: [OMPI users] How to verify RDMA traffic (RoCE) is being sent over a fabric when running OpenMPI

2016-11-08 Thread Howard Pritchard
Hi Brendan,

What type of Ethernet device (is this a Mellanox HCA?) and Ethernet switch
are you using?  The mpirun options look correct to me.  Is it possible that
you have all the MPI processes on a single node?

It should be pretty obvious from the IMB Sendrecv test whether you're using
RoCE: the large-message bandwidth will be much better than if you are going
through the tcp BTL.
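
If you want a quick A/B comparison, something along these lines should show
the difference (just a sketch, reusing the hostfile and binary from your
command):

# Sendrecv over the tcp BTL
mpirun --mca btl tcp,self,sm -np 4 -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1 Sendrecv
# Sendrecv over the openib BTL with your original settings
mpirun --mca btl openib,self,sm --mca btl_openib_cpc_include rdmacm \
       -np 4 -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1 Sendrecv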

If you're using Mellanox cards, you might want to do a sanity check using
the MXM libraries.  You'd want to set the MXM_TLS environment variable to
"self,shm,rc".  We got close to 90 Gb/sec bandwidth using ConnectX-4 cards
with the MXM MTL on a cluster earlier this year.
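
Something like this is what I have in mind (a sketch only; it assumes MXM
is installed and your Open MPI was built with MXM support):

mpirun --mca pml cm --mca mtl mxm -x MXM_TLS=self,shm,rc \
       -np 4 -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1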

Howard




Re: [OMPI users] How to verify RDMA traffic (RoCE) is being sent over a fabric when running OpenMPI

2016-11-08 Thread Howard Pritchard
Hi Brendan,

I should clarify, as my response may confuse folks: for those measurements
we had configured the ConnectX-4 cards to use Ethernet/RoCE rather than the
IB transport.

Howard

