Re: [OMPI users] Broadcast faster than barrier

2016-05-30 Thread Saliya Ekanayake
Thank you, Gilles and Jeff. This makes a lot of sense now. And, Jeff, I think the paper you mentioned is this http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5184825&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D5184825 ? Thank you, Saliya On Mon, May 30, 2016 at 9:

Re: [OMPI users] Broadcast faster than barrier

2016-05-30 Thread Jeff Hammond
> So, you mean that it guarantees the value received after the bcast call is > consistent with value sent from root, but it doesn't have to wait till all > the ranks have received it? > > this is what i believe, double checking the standard might not hurt though > ... > No function has barrier sem

Re: [OMPI users] Broadcast faster than barrier

2016-05-30 Thread Gilles Gouaillardet
On 5/30/2016 11:09 PM, Saliya Ekanayake wrote: So, you mean that it guarantees the value received after the bcast call is consistent with value sent from root, but it doesn't have to wait till all the ranks have received it? this is what i believe, double checking the standard might not hurt

Re: [OMPI users] Broadcast faster than barrier

2016-05-30 Thread Saliya Ekanayake
So, you mean that it guarantees the value received after the bcast call is consistent with value sent from root, but it doesn't have to wait till all the ranks have received it? Still, in this benchmark shouldn't the max time for bcast be equal to that of barrier? On Mon, May 30, 2016 at 9:33 AM,

Re: [OMPI users] Broadcast faster than barrier

2016-05-30 Thread Gilles Gouaillardet
These are very different algorithms, so performance might differ (greatly). For example, MPI_Bcast on the root rank can MPI_Send() and return; if the message is short, this is likely an eager send, which is very fast. That means MPI_Bcast() returns before all ranks have received the data, or even entered MPI
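The point about the root returning early can be illustrated with a small timing model. This is only a sketch under stated assumptions, not how Open MPI necessarily implements either collective: it assumes a binomial-tree broadcast with eager sends and a dissemination barrier, and the function names (bcast_exit_times, barrier_exit_times) and the latency/overhead parameters are hypothetical. Times are integers in arbitrary units (for example nanoseconds).

```python
def bcast_exit_times(arrival, latency, overhead):
    """Per-rank exit times for a binomial-tree broadcast rooted at rank 0.

    Assumption: sends are eager, so a sender pays only `overhead` per
    message and never waits for the receiver to enter the call.
    """
    n = len(arrival)
    ready = [None] * n          # time at which a rank holds the data
    ready[0] = arrival[0]       # the root has the data when it enters
    exit_t = [0] * n
    for r in range(n):          # a parent always has a lower rank than its children
        t = ready[r]
        k = r.bit_length()      # smallest k with 2**k > r
        children = []
        while r + (1 << k) < n:
            children.append(r + (1 << k))
            k += 1
        for c in reversed(children):        # farthest child first
            t += overhead                   # local cost of posting the send
            ready[c] = max(arrival[c], t + latency)
        exit_t[r] = t           # the rank returns once its sends are posted
    return exit_t


def barrier_exit_times(arrival, latency, overhead):
    """Per-rank exit times for a dissemination barrier (an assumed algorithm).

    In each round a rank sends to (r + step) % n and waits for the
    message from (r - step) % n before starting the next round.
    """
    n = len(arrival)
    t = list(arrival)
    step = 1
    while step < n:
        t = [max(t[r] + overhead, t[(r - step) % n] + overhead + latency)
             for r in range(n)]
        step *= 2
    return t


if __name__ == "__main__":
    # Seven ranks enter at time 0; the last rank enters much later.
    arrival = [0] * 7 + [50000]
    print("bcast exits:  ", bcast_exit_times(arrival, 1000, 100))
    print("barrier exits:", barrier_exit_times(arrival, 1000, 100))
```

In this model the root's broadcast exit time depends only on its own send overhead, while no rank can leave the barrier before the last rank has entered it. How that shows up in a benchmark still depends on how per-rank times are aggregated.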

Re: [OMPI users] Broadcast faster than barrier

2016-05-30 Thread Saliya Ekanayake
These were taken using OSU Micro benchmarks 5.3 http://mvapich.cse.ohio-state.edu/benchmarks/ For example, in a cluster with 32 nodes each running 24 processes, Broadcast for 1 to 64 bytes takes around 36 us, whereas the barrier takes 165 us. These were on 40 Gbps InfiniBand # OSU MPI Broadcast La

Re: [OMPI users] Broadcast faster than barrier

2016-05-30 Thread Dorier, Matthieu
Hi, How are you measuring these times? Thanks, Matthieu From: users [users-boun...@open-mpi.org] on behalf of Saliya Ekanayake [esal...@gmail.com] Sent: Monday, May 30, 2016 7:53 AM To: Open MPI Users Subject: [OMPI users] Broadcast faster than barrier Hi, I