Hi Brendan, I should clarify, as my response may confuse folks: we had configured the ConnectX-4 cards to use Ethernet/RoCE rather than IB transport for these measurements.
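As a side note for anyone who wants to confirm which mode their own cards are in: the link layer is reported per port by ibv_devinfo. A minimal check, assuming the standard RDMA userspace tools (libibverbs) are installed on the node:

  # "link_layer: Ethernet" means the port is in Ethernet/RoCE mode;
  # "link_layer: InfiniBand" means native IB transport.
  ibv_devinfo | grep -E 'hca_id|port:|link_layer'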
Howard

2016-11-08 16:08 GMT-07:00 Howard Pritchard <hpprit...@gmail.com>:

> Hi Brendan,
>
> What type of Ethernet device (is this a Mellanox HCA?) and Ethernet
> switch are you using? The mpirun configuration options look correct to
> me. Is it possible that you have all the MPI processes on a single
> node?
>
> It should be pretty obvious from the IMB SendRecv test whether you're
> using RoCE: the large-message bandwidth will be much better than if
> you are going through the tcp btl.
>
> If you're using Mellanox cards, you might want to do a sanity check
> using the MXM libraries. You'd want to set the MXM_TLS env. variable
> to "self,shm,rc". We got close to 90 Gb/sec bandwidth using
> ConnectX-4 + the MXM MTL on a cluster earlier this year.
>
> Howard
>
> 2016-11-08 15:15 GMT-07:00 Brendan Myers <brendan.my...@soft-forge.com>:
>
>> Hello,
>>
>> I am trying to figure out how I can verify that the Open MPI traffic
>> is actually being transmitted over the RoCE fabric connecting my
>> cluster. My MPI job runs quickly and error-free, but I cannot seem
>> to verify that significant amounts of data are being transferred to
>> the other endpoint in my RoCE fabric. I am able to see what I
>> believe to be the oob data when I remove the oob exclusion from my
>> command and analyze my RoCE interface using the tools listed below.
>>
>> Software:
>>
>>   * CentOS 7.2
>>   * Open MPI 2.0.1
>>
>> Command:
>>
>>   mpirun --mca btl openib,self,sm --mca oob_tcp_if_exclude eth3 \
>>          --mca btl_openib_receive_queues P,65536,120,64,32 \
>>          --mca btl_openib_cpc_include rdmacm \
>>          -np 4 -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1
>>
>>   * eth3 is my RoCE interface
>>   * The RoCE interfaces of the two nodes involved are defined in my
>>     mpi-hosts-ce file
>>
>> Ways I have looked to verify data transfer:
>>
>>   * The port counters on my RoCE switch: they show data being sent
>>     when using ib_write_bw but not when using Open MPI
>>   * ibdump: sees data when using ib_write_bw but not Open MPI
>>   * Wireshark: sees data when using ib_write_bw but not Open MPI
>>
>> I do not have much experience with Open MPI and apologize if I have
>> left out necessary information. I will respond with any data
>> requested. I appreciate the time spent to read and respond to this.
>>
>> Thank you,
>>
>> Brendan T. W. Myers
>> brendan.my...@soft-forge.com
>> Software Forge Inc
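Two quick ways to act on the suggestions above and check whether the job is really using RoCE. This is a sketch under a couple of assumptions: the interface name, hostfile, and process count are taken from Brendan's command, and the tcpdump filters assume standard RoCE framing (RoCEv2 runs over UDP destination port 4791; RoCEv1 uses ethertype 0x8915). The MXM run assumes Open MPI was built with MXM support.

  # 1. Bandwidth sanity check: if RoCE is in use, large-message SendRecv
  #    bandwidth over the openib btl should far exceed the tcp btl run.
  mpirun --mca btl openib,self,sm --mca btl_openib_cpc_include rdmacm \
         -np 4 -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1 SendRecv
  mpirun --mca btl tcp,self,sm \
         -np 4 -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1 SendRecv

  # 2. Capture RoCE frames on the RoCE interface while the job runs.
  #    RoCEv2 is UDP-encapsulated on destination port 4791:
  tcpdump -i eth3 -c 100 'udp dst port 4791'
  #    RoCEv1 frames carry ethertype 0x8915 instead:
  tcpdump -i eth3 -c 100 'ether proto 0x8915'

  # 3. Optional MXM sanity check (Mellanox cards), per Howard's note:
  export MXM_TLS=self,shm,rc
  mpirun --mca pml cm --mca mtl mxm \
         -np 4 -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1 SendRecv

If neither capture filter shows traffic while the openib run still reports high bandwidth, the traffic may be leaving on a different interface than expected, which would be worth ruling out before digging further.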