Hi,

Another interesting point: the bandwidths for the last two message sizes tested (2MB and 4MB) are lower than expected for both osu_bw and osu_bibw. Increasing the minimum size to use the RDMA pipeline to above these sizes brings those two data points up to scratch for both benchmarks:

3.0.0, osu_bw, no rdma for large messages

> mpirun -mca btl_openib_min_rdma_pipeline_size 4194304 -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bw -m 2097152:4194304
# OSU MPI Bi-Directional Bandwidth Test v5.4.0
# Size      Bandwidth (MB/s)
2097152    6133.22
4194304    6054.06

3.0.0, osu_bibw, eager rdma disabled, no rdma for large messages

> mpirun -mca btl_openib_min_rdma_pipeline_size 4194304 -mca btl_openib_use_eager_rdma 0 -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bibw -m 2097152:4194304
# OSU MPI Bi-Directional Bandwidth Test v5.4.0
# Size      Bandwidth (MB/s)
2097152   11397.85
4194304   11389.64

This makes me think something odd is going on in the RDMA pipeline.

Cheers,
Ben

> On 5 Apr 2018, at 5:03 pm, Ben Menadue <ben.mena...@nci.org.au> wrote:
>
> Hi,
>
> We’ve just been running some OSU benchmarks with OpenMPI 3.0.0 and noticed
> that osu_bibw gives nowhere near the bandwidth I’d expect (this is on FDR
> IB). However, osu_bw is fine.
>
> If I disable eager RDMA, then osu_bibw gives the expected numbers. Similarly,
> if I increase the number of eager RDMA buffers, it gives the expected results.
>
> OpenMPI 1.10.7 gives consistent, reasonable numbers with default settings,
> but they’re not as good as 3.0.0 (when tuned) for large buffers. The same
> option changes produce no difference in the performance for 1.10.7.
>
> I was wondering if anyone else has noticed anything similar, and if this is
> unexpected, if anyone has a suggestion on how to investigate further?
>
> Thanks,
> Ben
>
>
> Here are the numbers:
>
> 3.0.0, osu_bw, default settings
>
> > mpirun -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bw
> # OSU MPI Bandwidth Test v5.4.0
> # Size      Bandwidth (MB/s)
> 1             1.13
> 2             2.29
> 4             4.63
> 8             9.21
> 16           18.18
> 32           36.46
> 64           69.95
> 128         128.55
> 256         250.74
> 512         451.54
> 1024        829.44
> 2048       1475.87
> 4096       2119.99
> 8192       3452.37
> 16384      2866.51
> 32768      4048.17
> 65536      5030.54
> 131072     5573.81
> 262144     5861.61
> 524288     6015.15
> 1048576    6099.46
> 2097152     989.82
> 4194304     989.81
>
> 3.0.0, osu_bibw, default settings
>
> > mpirun -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bibw
> # OSU MPI Bi-Directional Bandwidth Test v5.4.0
> # Size      Bandwidth (MB/s)
> 1             0.00
> 2             0.01
> 4             0.01
> 8             0.02
> 16            0.04
> 32            0.09
> 64            0.16
> 128         135.30
> 256         265.35
> 512         499.92
> 1024        949.22
> 2048       1440.27
> 4096       1960.09
> 8192       3166.97
> 16384       127.62
> 32768       165.12
> 65536       312.80
> 131072     1120.03
> 262144     4724.01
> 524288     4545.93
> 1048576    5186.51
> 2097152     989.84
> 4194304     989.88
>
> 3.0.0, osu_bibw, eager RDMA disabled
>
> > mpirun -mca btl_openib_use_eager_rdma 0 -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bibw
> # OSU MPI Bi-Directional Bandwidth Test v5.4.0
> # Size      Bandwidth (MB/s)
> 1             1.49
> 2             2.97
> 4             5.96
> 8            11.98
> 16           23.95
> 32           47.39
> 64           93.57
> 128         153.82
> 256         304.69
> 512         572.30
> 1024       1003.52
> 2048       1083.89
> 4096       1879.32
> 8192       2785.18
> 16384      3535.77
> 32768      5614.72
> 65536      8113.69
> 131072     9666.74
> 262144    10738.97
> 524288    11247.02
> 1048576   11416.50
> 2097152     989.88
> 4194304     989.88
>
> 3.0.0, osu_bibw, increased eager RDMA buffer count
>
> > mpirun -mca btl_openib_eager_rdma_num 32768 -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bibw
> # OSU MPI Bi-Directional Bandwidth Test v5.4.0
> # Size      Bandwidth (MB/s)
> 1             1.42
> 2             2.84
> 4             5.67
> 8            11.18
> 16           22.46
> 32           44.65
> 64           83.10
> 128         154.00
> 256         291.63
> 512         537.66
> 1024        942.35
> 2048       1433.09
> 4096       2356.40
> 8192       1998.54
> 16384      3584.82
> 32768      5523.08
> 65536      7717.63
> 131072     9419.50
> 262144    10564.77
> 524288    11104.71
> 1048576   11130.75
> 2097152    7943.89
> 4194304    5270.00
>
> 1.10.7, osu_bibw, default settings
>
> > mpirun -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bibw
> # OSU MPI Bi-Directional Bandwidth Test v5.4.0
> # Size      Bandwidth (MB/s)
> 1             1.70
> 2             3.45
> 4             6.95
> 8            13.68
> 16           27.41
> 32           53.80
> 64          105.34
> 128         164.40
> 256         324.63
> 512         623.95
> 1024       1127.35
> 2048       1784.58
> 4096       3305.45
> 8192       3697.55
> 16384      4935.75
> 32768      7186.28
> 65536      8996.94
> 131072     9301.78
> 262144     4691.36
> 524288     7039.18
> 1048576    7213.33
> 2097152    9601.41
> 4194304    9281.31
>
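
For anyone who wants to scan tables like the ones above automatically, here is a minimal Python sketch that parses OSU-style output and flags message sizes where the bandwidth falls sharply relative to the previous size. The helper names (parse_osu, find_drops) and the 0.5 threshold are assumptions made for illustration; they are not part of the OSU benchmarks or Open MPI. The sample rows are the tail of the 3.0.0 osu_bw run with default settings.

```python
import re

def parse_osu(text):
    """Return [(size_bytes, bandwidth_mb_s), ...] from OSU benchmark output."""
    rows = []
    for line in text.splitlines():
        # Strip mail-quote markers ("> ") so quoted output can be pasted as-is.
        line = line.lstrip("> ").strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the "# Size  Bandwidth" headers
        m = re.fullmatch(r"(\d+)\s+(\d+(?:\.\d+)?)", line)
        if m:
            rows.append((int(m.group(1)), float(m.group(2))))
    return rows

def find_drops(rows, ratio=0.5):
    """Flag sizes whose bandwidth is below `ratio` times the previous size's.

    The default ratio of 0.5 is an arbitrary, hypothetical threshold.
    """
    drops = []
    for (_, prev_bw), (size, bw) in zip(rows, rows[1:]):
        if prev_bw > 0 and bw < ratio * prev_bw:
            drops.append((size, prev_bw, bw))
    return drops

# Tail of the 3.0.0 osu_bw run with default settings (quoted above).
sample = """\
> # Size Bandwidth (MB/s)
> 524288 6015.15
> 1048576 6099.46
> 2097152 989.82
> 4194304 989.81
"""

drops = find_drops(parse_osu(sample))
print(drops)  # [(2097152, 6099.46, 989.82)]
```

With this sample, only the 2 MB row is flagged, because the 4 MB row is compared against the already-low 2 MB value. Comparing against the running maximum instead would flag both.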