Adam,

You can also set btl_tcp_links to 2 or 3 to allow multiple connections
between peers, potentially giving higher aggregate bandwidth.
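For example, something along these lines (a sketch; the right value
depends on the network, and the mpirun --mca syntax is the same one
Gilles shows below):

mpirun --mca btl_tcp_links 2 ...

or, to avoid typing it on every run, as a line in an MCA parameter file
such as $HOME/.openmpi/mca-params.conf:

btl_tcp_links = 2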
  George.

On Sun, Jul 9, 2017 at 10:04 AM, Adam Sylvester <op8...@gmail.com> wrote:
> Gilles,
>
> Thanks for the fast response!
>
> The --mca btl_tcp_sndbuf 0 --mca btl_tcp_rcvbuf 0 flags you recommended
> made a huge difference - this got me up to 5.7 Gb/s! I wasn't aware of
> these flags... with a little Googling, is
> https://www.open-mpi.org/faq/?category=tcp the best place to look for
> this kind of information and any other tweaks I may want to try (or if
> there's a better FAQ out there, please let me know)?
>
> There is only eth0 on my machines so nothing to tweak there (though good
> to know for the future). I also didn't see any improvement by specifying
> more sockets per instance. But, your initial suggestion had a major
> impact.
>
> In general I try to stay relatively up to date with my Open MPI version;
> I'll be extra motivated to upgrade to 2.1.2 so that I don't have to
> remember to set these --mca flags on the command line. :o)
>
> -Adam
>
> On Sun, Jul 9, 2017 at 9:26 AM, Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
>> Adam,
>>
>> at first, you need to change the default send and receive socket
>> buffers:
>> mpirun --mca btl_tcp_sndbuf 0 --mca btl_tcp_rcvbuf 0 ...
>> /* note this will be the default from Open MPI 2.1.2 */
>>
>> hopefully, that will be enough to greatly improve the bandwidth for
>> large messages.
>>
>> generally speaking, i recommend you use the latest available version
>> (e.g. Open MPI 2.1.1)
>>
>> how many interfaces can be used to communicate between hosts?
>> if there is more than one (for example a slow and a fast one), you'd
>> rather only use the fast one.
>> for example, if eth0 is the fast interface, that can be achieved with
>> mpirun --mca btl_tcp_if_include eth0 ...
>>
>> also, you might be able to achieve better results by using more than
>> one socket on the fast interface.
>> for example, if you want to use 4 sockets per interface
>> mpirun --mca btl_tcp_links 4 ...
>>
>> Cheers,
>>
>> Gilles
>>
>> On Sun, Jul 9, 2017 at 10:10 PM, Adam Sylvester <op8...@gmail.com> wrote:
>> > I am using Open MPI 2.1.0 on RHEL 7. My application has one
>> > unavoidable pinch point where a large amount of data needs to be
>> > transferred (about 8 GB of data needs to be both sent to and received
>> > from all other ranks), and I'm seeing worse performance than I would
>> > expect; this step has a major impact on my overall runtime. In the
>> > real application, I am using MPI_Alltoall() for this step, but for the
>> > purpose of a simple benchmark, I simplified it down to a single
>> > MPI_Send() / MPI_Recv() of a 2 GB buffer between two ranks.
>> >
>> > I'm running this in AWS with instances that have 10 Gbps connectivity
>> > in the same availability zone (according to tracepath, there are no
>> > hops between them) and MTU set to 8801 bytes. Doing a non-MPI
>> > benchmark of sending data directly over TCP between these two
>> > instances, I reliably get around 4 Gbps. Between these same two
>> > instances with MPI_Send() / MPI_Recv(), I reliably get around 2.4
>> > Gbps. This seems like a major performance degradation for a single
>> > MPI operation.
>> >
>> > I compiled Open MPI 2.1.0 with gcc 4.9.1 and default settings. I'm
>> > connecting between instances via ssh and using, I assume, TCP for the
>> > actual network transfer (I'm not setting any special command-line or
>> > programmatic settings).
>> > The actual command I'm running is:
>> > mpirun -N 1 --bind-to none --hostfile hosts.txt my_app
>> >
>> > Any advice on other things to test or compilation and/or runtime
>> > flags to set would be much appreciated!
>> >
>> > -Adam
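For reference, the single MPI_Send() / MPI_Recv() test described above can
be as small as the sketch below. This is not the original benchmark code;
the choice of MPI_DOUBLE, the ~2 GiB buffer size, and timing with
MPI_Wtime() are assumptions made for illustration.

/* Minimal point-to-point bandwidth test: rank 0 sends a ~2 GiB buffer
 * to rank 1 with a single MPI_Send()/MPI_Recv() pair, and rank 1 reports
 * the achieved bandwidth. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* 2 GiB expressed as doubles so the element count fits in an int. */
    const int count = (int)((2UL << 30) / sizeof(double));
    double *buf = malloc((size_t)count * sizeof(double));
    /* buffer contents do not matter for a bandwidth test */

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    if (rank == 0) {
        MPI_Send(buf, count, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, count, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        double gbits = 8.0 * count * sizeof(double) / 1e9;
        printf("%.2f Gb/s\n", gbits / (MPI_Wtime() - t0));
    }

    free(buf);
    MPI_Finalize();
    return 0;
}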
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users