Which version of OMPI were you testing?
> On Nov 3, 2014, at 9:14 AM, Steven Eliuk <s.el...@samsung.com> wrote:
>
> Hello,
>
> We have been using Open MPI for some testing. Everything works fine, except
> that MPI_Ibcast() randomly takes a long time to finish. We have a standalone
> program that tests just this call. The following are the profiling results of
> that test program on our cluster:
>
> Ibcast 604 mb takes 103 ms
> Ibcast 608 mb takes 106 ms
> Ibcast 612 mb takes 105 ms
> Ibcast 616 mb takes 105 ms
> Ibcast 620 mb takes 107 ms
> Ibcast 624 mb takes 107 ms
> Ibcast 628 mb takes 108 ms
> Ibcast 632 mb takes 110 ms
> Ibcast 636 mb takes 110 ms
> Ibcast 640 mb takes 7437 ms
> Ibcast 644 mb takes 115 ms
> Ibcast 648 mb takes 111 ms
> Ibcast 652 mb takes 112 ms
> Ibcast 656 mb takes 112 ms
> Ibcast 660 mb takes 114 ms
> Ibcast 664 mb takes 114 ms
> Ibcast 668 mb takes 115 ms
> Ibcast 672 mb takes 116 ms
> Ibcast 676 mb takes 116 ms
> Ibcast 680 mb takes 116 ms
> Ibcast 684 mb takes 122 ms
> Ibcast 688 mb takes 7385 ms
> Ibcast 692 mb takes 8729 ms
> Ibcast 696 mb takes 120 ms
> Ibcast 700 mb takes 124 ms
> Ibcast 704 mb takes 121 ms
> Ibcast 708 mb takes 8240 ms
> Ibcast 712 mb takes 122 ms
> Ibcast 716 mb takes 123 ms
> Ibcast 720 mb takes 123 ms
> Ibcast 724 mb takes 124 ms
> Ibcast 728 mb takes 125 ms
> Ibcast 732 mb takes 125 ms
> Ibcast 736 mb takes 126 ms
>
> As you can see, Ibcast occasionally takes a long time to finish (more than
> 7 seconds instead of roughly 100 ms), and the slow calls appear at random.
> The same program, compiled and tested with MVAPICH2-GDR, ran smoothly.
> Both tests ran exclusively on our four-node cluster, without contention.
> It also makes no difference whether I enable CUDA support or not. The
> following is the configuration of our servers:
>
> We have four nodes in this test, each with one K40 GPU, connected via
> Mellanox InfiniBand (IB).
>
> Please find attached config details and some sample code…
>
> Kindest Regards,
> —
> Steven Eliuk, Ph.D. Comp Sci,
> Advanced Software Platforms Lab,
> SRA - SV,
> Samsung Electronics,
> 1732 North First Street,
> San Jose, CA 95112,
> Work: +1 408-652-1976,
> Work: +1 408-544-5781 Wednesdays,
> Cell: +1 408-819-4407.
>
> <Ibcast_config_details.txt.zip> <Ibcast_SampleCode.cpp>
> _______________________________________________
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/11/25662.php
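
To pin down which calls are slow rather than eyeballing the list, the timings could be compared against their median. The following is only a rough sketch in C++, not taken from the attached sample code: the names Sample, median_ms and find_outliers are hypothetical, and the choice of "more than some factor times the median" as the cutoff is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One measurement: message size as printed by the test program, and elapsed time.
struct Sample {
    int size_mb;
    double elapsed_ms;
};

// Median of the elapsed times; returns 0 for an empty input.
// Takes the vector by value so the caller's ordering is left untouched.
double median_ms(std::vector<Sample> samples) {
    if (samples.empty()) return 0.0;
    std::sort(samples.begin(), samples.end(),
              [](const Sample& a, const Sample& b) { return a.elapsed_ms < b.elapsed_ms; });
    const std::size_t n = samples.size();
    return (n % 2 == 1)
               ? samples[n / 2].elapsed_ms
               : 0.5 * (samples[n / 2 - 1].elapsed_ms + samples[n / 2].elapsed_ms);
}

// Return, in input order, the samples slower than `factor` times the median.
// The factor is an arbitrary cutoff chosen by the caller (an assumption here).
std::vector<Sample> find_outliers(const std::vector<Sample>& samples, double factor) {
    const double threshold = factor * median_ms(samples);
    std::vector<Sample> outliers;
    for (const Sample& s : samples) {
        if (s.elapsed_ms > threshold) outliers.push_back(s);
    }
    return outliers;
}
```

Feeding it the (size, time) pairs from the output above with a factor of, say, 10 should single out the multi-second calls while ignoring the slow drift from about 100 ms to about 125 ms as the message size grows.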