Thanks, I understand this, but the delays that I measure are huge compared to a classical ack procedure (1000x more). And this is repeatable: as far as I understand it, this shows that the network is not involved.

Ghislain.
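(For reference, the "classical ack procedure" used as a yardstick above can be timed with a sketch along these lines. This is an illustration only, not code from the thread; it assumes at least two ranks on MPI_COMM_WORLD and measures a single send/recv round trip with MPI_Wtime, like the barrier measurement quoted below.)

===================
! Sketch: time one "ack" round trip between ranks 0 and 1.
! Run with at least 2 ranks.
program ack_timing
  use mpi
  implicit none
  integer :: ierr, rank, dummy
  integer :: status(MPI_STATUS_SIZE)
  double precision :: start_time, ack_time

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

  dummy = 0
  start_time = MPI_WTIME()
  if (rank == 0) then
     ! rank 0 sends a message and waits for the acknowledgment
     call MPI_SEND(dummy, 1, MPI_INTEGER, 1, 0, MPI_COMM_WORLD, ierr)
     call MPI_RECV(dummy, 1, MPI_INTEGER, 1, 1, MPI_COMM_WORLD, status, ierr)
  else if (rank == 1) then
     ! rank 1 receives and immediately acknowledges
     call MPI_RECV(dummy, 1, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, status, ierr)
     call MPI_SEND(dummy, 1, MPI_INTEGER, 0, 1, MPI_COMM_WORLD, ierr)
  end if
  ack_time = MPI_WTIME() - start_time
  if (rank <= 1) write(*,*) "ack round-trip time =", ack_time

  call MPI_FINALIZE(ierr)
end program ack_timing
===================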
On Sep 8, 2011, at 4:16 PM, Teng Ma wrote:

> I guess you forgot to count the "leaving time" (fan-out). When everyone
> hits the barrier, each process still needs an "ack" to leave. And remember,
> in most cases the leader process will send out the "acks" sequentially.
> It's very possible that:
>
> P0 barrier time = 29 + send/recv ack 0
> P1 barrier time = 14 + send ack 0 + send/recv ack 1
> P2 barrier time = 0 + send ack 0 + send ack 1 + send/recv ack 2
>
> That's your measured time.
>
> Teng
>
>> This problem has nothing to do with stdout...
>>
>> Example with 3 processes:
>>
>> P0 hits the barrier at t=12
>> P1 hits the barrier at t=27
>> P2 hits the barrier at t=41
>>
>> In this situation:
>> P0 waits 41-12 = 29
>> P1 waits 41-27 = 14
>> P2 waits 41-41 = 00
>>
>> So I should see something like (no ordering is expected):
>> barrier_time = 14
>> barrier_time = 00
>> barrier_time = 29
>>
>> But what I see is much more like:
>> barrier_time = 22
>> barrier_time = 29
>> barrier_time = 25
>>
>> See? No process has a barrier_time equal to zero!!!
>>
>> On Sep 8, 2011, at 2:55 PM, Jeff Squyres wrote:
>>
>>> The order in which you see stdout printed from mpirun is not necessarily
>>> reflective of the order in which things were actually printed. Remember
>>> that the stdout from each MPI process needs to flow through at least 3
>>> processes, and potentially across the network, before it is actually
>>> displayed on mpirun's stdout:
>>>
>>> MPI process -> local Open MPI daemon -> mpirun -> printed to mpirun's stdout
>>>
>>> Hence, the ordering of stdout can get transposed.
>>>
>>> On Sep 8, 2011, at 8:49 AM, Ghislain Lartigue wrote:
>>>
>>>> Thank you for this explanation, but indeed it confirms that the LAST
>>>> process that hits the barrier should go through nearly instantaneously
>>>> (except for the broadcast time of the acknowledgment signal).
>>>> And this is not what happens in my code: EVERY process waits for a
>>>> very long time before going through the barrier (thousands of times
>>>> longer than a broadcast)...
>>>>
>>>> On Sep 8, 2011, at 2:26 PM, Jeff Squyres wrote:
>>>>
>>>>> The order in which processes hit the barrier is only one factor in the
>>>>> time it takes for each process to finish the barrier.
>>>>>
>>>>> An easy way to think of a barrier implementation is a "fan in/fan out"
>>>>> model. When each nonzero-rank process calls MPI_BARRIER, it sends a
>>>>> message saying "I have hit the barrier!" (it usually sends it to its
>>>>> parent in a tree of all MPI processes in the communicator, but you can
>>>>> simplify this model and consider that it sends it to rank 0). Rank 0
>>>>> collects all of these messages. When it has messages from all
>>>>> processes in the communicator, it sends out "ok, you can leave the
>>>>> barrier now" messages (again, usually via a tree distribution, but you
>>>>> can pretend that it directly, linearly sends a message to each peer
>>>>> process in the communicator).
>>>>>
>>>>> Hence, the time that any individual process spends in the barrier
>>>>> depends on when every other process enters the barrier. But it is also
>>>>> dependent upon communication speed, congestion in the network, etc.
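(The linear fan in/fan out model described above can be sketched in a few lines of Fortran. This is an illustration of the model only, not Open MPI's actual implementation, which uses tree-based algorithms and other optimizations.)

===================
! Sketch of a linear fan-in/fan-out barrier with rank 0 as the leader.
subroutine linear_barrier(comm, ierr)
  use mpi
  implicit none
  integer, intent(in)  :: comm
  integer, intent(out) :: ierr
  integer :: rank, nprocs, p, dummy
  integer :: status(MPI_STATUS_SIZE)

  call MPI_COMM_RANK(comm, rank, ierr)
  call MPI_COMM_SIZE(comm, nprocs, ierr)
  dummy = 0

  if (rank == 0) then
     ! Fan-in: collect "I have hit the barrier!" from every other rank.
     do p = 1, nprocs - 1
        call MPI_RECV(dummy, 1, MPI_INTEGER, p, 0, comm, status, ierr)
     end do
     ! Fan-out: release everyone with an "ok, you can leave" message.
     do p = 1, nprocs - 1
        call MPI_SEND(dummy, 1, MPI_INTEGER, p, 1, comm, ierr)
     end do
  else
     ! Notify rank 0, then wait to be released.
     call MPI_SEND(dummy, 1, MPI_INTEGER, 0, 0, comm, ierr)
     call MPI_RECV(dummy, 1, MPI_INTEGER, 0, 1, comm, status, ierr)
  end if
end subroutine linear_barrier
===================

(In this model even the last arrival still pays for the fan-out, which is the point Teng Ma makes above.)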
>>>>> On Sep 8, 2011, at 6:20 AM, Ghislain Lartigue wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> at a given point in my (Fortran90) program, I write:
>>>>>>
>>>>>> ===================
>>>>>> start_time = MPI_Wtime()
>>>>>> call MPI_BARRIER(...)
>>>>>> new_time = MPI_Wtime() - start_time
>>>>>> write(*,*) "barrier time =", new_time
>>>>>> ===================
>>>>>>
>>>>>> and then I run my code...
>>>>>>
>>>>>> I expected that the values of "new_time" would range from 0 to Tmax
>>>>>> (1700 in my case).
>>>>>> As I understand it, the first process that hits the barrier should
>>>>>> print Tmax and the last process that hits the barrier should print 0
>>>>>> (or a very low value).
>>>>>>
>>>>>> But this is not the case: all processes print values in the range
>>>>>> 1400-1700!
>>>>>>
>>>>>> Any explanation?
>>>>>>
>>>>>> Thanks,
>>>>>> Ghislain.
>>>>>>
>>>>>> PS:
>>>>>> This small code behaves perfectly in other parts of my code...
>
> | Teng Ma                        Univ. of Tennessee |
> | t...@cs.utk.edu                Knoxville, TN      |
> | http://web.eecs.utk.edu/~tma/                     |
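(For completeness, a self-contained version of the snippet quoted above might look like the following. This is a sketch: it assumes the barrier is on MPI_COMM_WORLD and fills in the MPI setup/teardown that the thread elides.)

===================
! Sketch: complete program around the timing fragment discussed in the thread.
program barrier_timing
  use mpi
  implicit none
  integer :: ierr, rank
  double precision :: start_time, new_time

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

  ! ... application work that desynchronizes the ranks would happen here ...

  start_time = MPI_WTIME()
  call MPI_BARRIER(MPI_COMM_WORLD, ierr)
  new_time = MPI_WTIME() - start_time
  write(*,*) "rank", rank, "barrier time =", new_time

  call MPI_FINALIZE(ierr)
end program barrier_timing
===================

(If every rank prints a similarly large new_time, the fan-out cost described above is one candidate explanation, since even the last arrival has to wait for its release message.)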