No, I have not tried multi-link.
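For reference, "multi-link" here refers to the btl_tcp_links MCA parameter mentioned further down in the thread; trying it would presumably amount to adding that parameter to the launch command already in use, along these lines (the value 2 is simply the smaller of the two values suggested below):

  mpirun --mca btl_tcp_links 2 -np 2 -hostfile machines -npernode 1 ./latency.out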
On Mon, Apr 21, 2014 at 11:50 PM, George Bosilca <bosi...@icl.utk.edu> wrote:

> Have you tried the multi-link? Did it help?
>
> George.
>
> On Apr 21, 2014, at 10:34, Muhammad Ansar Javed <muhammad.an...@seecs.edu.pk> wrote:
>
> I am able to achieve around 90% (maximum 9390 Mbps) bandwidth on 10GE.
> There were configuration issues; disabling Intel SpeedStep and interrupt
> coalescing helped in achieving the expected network bandwidth. Varying
> send and recv buffer sizes from 128 KB to 1 MB added just 50 Mbps, with
> the maximum bandwidth achieved at a 1 MB buffer size.
> Thanks for the support.
>
> On Thu, Apr 17, 2014 at 6:05 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
>> Muhammad,
>>
>> Our TCP configuration is tailored for 1 Gbps networks, so its
>> performance on 10G might be sub-optimal. That being said, the remainder
>> of this email is speculation, as I do not have access to a 10G system to
>> test on.
>>
>> There are two things that I would test to see if I can improve the
>> performance.
>>
>> 1. The send and receive TCP buffers. These are handled by the
>> btl_tcp_sndbuf and btl_tcp_rcvbuf MCA parameters. By default these are
>> set to 128 KB, which is extremely small for a 10G network. Try 256 KB or
>> maybe even 1 MB (you might need to fiddle with your kernel to get there).
>>
>> 2. Add more links between the processes by increasing the default value
>> of btl_tcp_links to 2 or 4.
>>
>> You might also try the following (but here I'm more skeptical): try
>> pushing the value of btl_tcp_endpoint_cache up. This parameter is not to
>> be used eagerly in real applications with a complete communication
>> pattern, but for a benchmark it might be of good use.
>>
>> George.
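For readers following along: the knobs named above are ordinary MCA parameters, so suggestion 1 and the endpoint-cache idea can be tried straight from the mpirun command line used elsewhere in the thread. A sketch (1048576 bytes is the 1 MB upper value mentioned above; the btl_tcp_endpoint_cache value is purely illustrative):

  mpirun --mca btl_tcp_sndbuf 1048576 --mca btl_tcp_rcvbuf 1048576 \
         --mca btl_tcp_endpoint_cache 65536 \
         -np 2 -hostfile machines -npernode 1 ./latency.out

The "fiddle with your kernel" remark presumably refers to the Linux socket-buffer ceilings; if so, raising them on both hosts would look roughly like the following (the values are an assumption, not taken from the thread):

  sysctl -w net.core.rmem_max=1048576
  sysctl -w net.core.wmem_max=1048576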
>> On Apr 16, 2014, at 06:30, Muhammad Ansar Javed <muhammad.an...@seecs.edu.pk> wrote:
>>
>> Hi Ralph,
>> Yes, you are right. I should have also tested the NetPipe MPI version
>> earlier. I ran the NetPipe MPI version on 10G Ethernet and the maximum
>> bandwidth achieved is 5872 Mbps. Moreover, the maximum bandwidth achieved
>> by the osu_bw test is 6080 Mbps. I used OSU Micro-Benchmarks version 4.3.
>>
>> On Wed, Apr 16, 2014 at 3:40 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>> I apologize, but I am now confused. Let me see if I can translate:
>>>
>>> * you ran the non-MPI version of the NetPipe benchmark and got 9.5 Gbps
>>> on a 10 Gbps network
>>>
>>> * you ran iperf and got 9.61 Gbps; however, this has nothing to do with
>>> MPI and just tests your TCP stack
>>>
>>> * you tested your bandwidth program on a 1 Gbps network and got about
>>> 90% efficiency.
>>>
>>> Is the above correct? If so, my actual suggestion was to run the MPI
>>> version of NetPipe and to use the OMB benchmark programs as well. Your
>>> program might well be okay, but benchmarking is a hard thing to get
>>> right in a parallel world, so you might as well validate it by
>>> cross-checking the results.
>>>
>>> I suggest this mostly because your performance numbers are far worse
>>> than anything we've measured using those standard benchmarks, so we
>>> should first ensure we aren't chasing a ghost.
>>>
>>> On Wed, Apr 16, 2014 at 1:41 AM, Muhammad Ansar Javed
>>> <muhammad.an...@seecs.edu.pk> wrote:
>>>
>>>> Yes, I have tried NetPipe-Java and iperf for bandwidth and
>>>> configuration tests. NetPipe-Java achieves a maximum of 9.40 Gbps,
>>>> while iperf achieves a maximum of 9.61 Gbps. I have also tested my
>>>> bandwidth program on a 1 Gbps Ethernet connection, and it achieves
>>>> 901 Mbps. I am using the same program for the 10G network benchmarks.
>>>> Please find attached the source file of the bandwidth program.
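The attached source file itself does not appear inline in the thread. For readers without the attachment, a minimal sketch of the usual windowed ping-pong bandwidth test (the same pattern osu_bw follows) is given below; the window size, repeat count, and message-size range are illustrative and are not taken from the attached program.

/* Minimal windowed bandwidth test sketch (illustrative, not the attached
 * program): rank 0 streams WINDOW non-blocking sends per iteration to
 * rank 1, waits for a 1-byte ack, and reports Mbps. Expects exactly 2 ranks. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define WINDOW  64   /* messages in flight per iteration (illustrative) */
#define REPEATS 20   /* timed iterations per message size (illustrative) */

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int size = 2048; size <= 16 * 1024 * 1024; size *= 2) {
        char *buf = malloc(size);
        char ack = 0;
        MPI_Request req[WINDOW];

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int r = 0; r < REPEATS; r++) {
            if (rank == 0) {
                /* post a window of non-blocking sends, then wait for the ack */
                for (int i = 0; i < WINDOW; i++)
                    MPI_Isend(buf, size, MPI_BYTE, 1, 0, MPI_COMM_WORLD, &req[i]);
                MPI_Waitall(WINDOW, req, MPI_STATUSES_IGNORE);
                MPI_Recv(&ack, 1, MPI_BYTE, 1, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else {
                /* mirror the window with non-blocking receives, then ack */
                for (int i = 0; i < WINDOW; i++)
                    MPI_Irecv(buf, size, MPI_BYTE, 0, 0, MPI_COMM_WORLD, &req[i]);
                MPI_Waitall(WINDOW, req, MPI_STATUSES_IGNORE);
                MPI_Send(&ack, 1, MPI_BYTE, 0, 1, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0) {
            /* Mbps = total bits transferred / elapsed seconds / 1e6 */
            double bits = (double)size * 8.0 * WINDOW * REPEATS;
            printf("%10d bytes  %10.2f Mbps\n", size, bits / (t1 - t0) / 1e6);
        }
        free(buf);
    }

    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched with two ranks, one per host (as in the mpirun command quoted later in the thread), it prints one bandwidth figure per message size.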
>>>>
>>>> As far as --bind-to core is concerned, I think it is working fine.
>>>> Here is the output of the --report-bindings switch:
>>>> [host3:07134] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././.]
>>>> [host4:10282] MCW rank 1 bound to socket 0[core 0[hwt 0]]: [B/././.]
>>>>
>>>> On Tue, Apr 15, 2014 at 8:39 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>>> Have you tried a typical benchmark (e.g., NetPipe or OMB) to ensure
>>>>> the problem isn't in your program? Outside of that, you might want to
>>>>> explicitly tell it to --bind-to core just to be sure it does so; it's
>>>>> supposed to do that by default, but you might as well be sure. You can
>>>>> check by adding --report-bindings to the command line.
>>>>>
>>>>> On Apr 14, 2014, at 11:10 PM, Muhammad Ansar Javed
>>>>> <muhammad.an...@seecs.edu.pk> wrote:
>>>>>
>>>>> Hi,
>>>>> I am trying to benchmark Open MPI performance on a 10G Ethernet
>>>>> network between two hosts. The performance numbers of the benchmarks
>>>>> are lower than expected. The maximum bandwidth achieved by OMPI-C is
>>>>> 5678 Mbps, and I was expecting around 9000+ Mbps. Moreover, latency is
>>>>> also quite a bit higher than expected, ranging from 37 to 59 us. Here
>>>>> is the complete set of numbers.
>>>>>
>>>>> Latency, Open MPI C
>>>>> Size (Bytes)    Time (us)
>>>>> 1               37.76
>>>>> 2               37.75
>>>>> 4               37.78
>>>>> 8               55.17
>>>>> 16              37.89
>>>>> 32              39.08
>>>>> 64              37.78
>>>>> 128             59.46
>>>>> 256             39.37
>>>>> 512             40.39
>>>>> 1024            47.18
>>>>> 2048            47.84
>>>>>
>>>>> Bandwidth, Open MPI C
>>>>> Size (Bytes)    Bandwidth (Mbps)
>>>>> 2048            412.22
>>>>> 4096            539.59
>>>>> 8192            827.73
>>>>> 16384           1655.35
>>>>> 32768           3274.3
>>>>> 65536           1995.22
>>>>> 131072          3270.84
>>>>> 262144          4316.22
>>>>> 524288          5019.46
>>>>> 1048576         5236.17
>>>>> 2097152         5362.61
>>>>> 4194304         5495.2
>>>>> 8388608         5565.32
>>>>> 16777216        5678.32
>>>>>
>>>>> My environment consists of two hosts with a point-to-point
>>>>> (switch-less) 10 Gbps Ethernet connection. The environment (OS, user,
>>>>> directory structure, etc.) on both hosts is exactly the same. There is
>>>>> no NAS or shared file system between the hosts. Following are the
>>>>> configuration and job-launching commands that I am using. Moreover, I
>>>>> have attached the output of ompi_info --all.
>>>>>
>>>>> Configuration command: ./configure --enable-mpi-java
>>>>> --prefix=/home/mpj/installed/openmpi_installed CC=/usr/bin/gcc
>>>>> --disable-mpi-fortran
>>>>>
>>>>> Job launching command: mpirun -np 2 -hostfile machines -npernode 1
>>>>> ./latency.out
>>>>>
>>>>> Are these numbers okay? If not, then please suggest performance tuning
>>>>> steps...
>>>>>
>>>>> Thanks
>>>>>
>>>>> --
>>>>> Ansar Javed
>>>>> HPC Lab
>>>>> SEECS NUST
>>>>> Contact: +92 334 438 9394
>>>>> Email: muhammad.an...@seecs.edu.pk
>>>>> <ompi_info.tar.bz2>

--
Regards

Ansar Javed
HPC Lab
SEECS NUST
Contact: +92 334 438 9394
Email: muhammad.an...@seecs.edu.pk