These "times" have no units; it's just an example.
Whatever units are used, at least one process should spend a very small amount
of time in the barrier (compared to the other processes), and this is not what
I see in my code.
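
A minimal sketch of how this expectation could be checked, assuming a standard
MPI Fortran installation that provides the "mpi" module. It records when each
rank arrives at the barrier and when it leaves, then gathers the values on
rank 0 so that the printed order does not depend on stdout forwarding. The
program name and the variables (barrier_timing, t_ref, t_arrive, t_leave,
mine, everyone) are illustrative and are not taken from the original code.
MPI_Wtime clocks are not guaranteed to be synchronized across processes, so
the initial barrier only gives a rough common time origin.

```fortran
program barrier_timing
  use mpi
  implicit none
  integer :: ierr, rank, nprocs, i
  double precision :: t_ref, t_arrive, t_leave
  double precision :: mine(2)
  double precision, allocatable :: everyone(:,:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! Rough common time origin: all ranks leave this barrier at about the
  ! same moment (assumption: the skew on exit is small compared to the
  ! times being measured).
  call MPI_Barrier(MPI_COMM_WORLD, ierr)
  t_ref = MPI_Wtime()

  ! ... the application's work would go here ...

  ! Time of arrival at the barrier and time of departure from it,
  ! both relative to the common origin.
  t_arrive = MPI_Wtime() - t_ref
  call MPI_Barrier(MPI_COMM_WORLD, ierr)
  t_leave = MPI_Wtime() - t_ref

  ! Collect (arrival, departure) pairs on rank 0 and print them there.
  mine(1) = t_arrive
  mine(2) = t_leave
  allocate(everyone(2, nprocs))
  call MPI_Gather(mine, 2, MPI_DOUBLE_PRECISION, &
                  everyone, 2, MPI_DOUBLE_PRECISION, &
                  0, MPI_COMM_WORLD, ierr)

  if (rank == 0) then
    do i = 1, nprocs
      write(*,'(a,i0,a,f12.6,a,f12.6,a,f12.6)') &
        "rank ", i - 1, &
        "  arrive =", everyone(1, i), &
        "  leave =", everyone(2, i), &
        "  in barrier =", everyone(2, i) - everyone(1, i)
    end do
  end if

  deallocate(everyone)
  call MPI_Finalize(ierr)
end program barrier_timing
```

If the rank with the largest "arrive" value also shows a large "in barrier"
value, the time is being spent in the barrier itself rather than in waiting
for late processes.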

The network is supposed to be excellent: my machine is #9 in the top500 
supercomputers... (http://top500.org/system/10589)

Ghislain.

On Sep 8, 2011, at 15:34, Jeff Squyres wrote:

> On Sep 8, 2011, at 9:17 AM, Ghislain Lartigue wrote:
> 
>> Example with 3 processes:
>> 
>> P0 hits barrier at t=12
>> P1 hits barrier at t=27
>> P2 hits barrier at t=41
> 
> What is the unit of time here, and how well are these times synchronized?
> 
>> In this situation:
>> P0 waits 41-12 = 29
>> P1 waits 41-27 = 14
>> P2 waits 41-41 = 00
>> 
>> So I should see something like (no ordering is expected):
>> barrier_time = 14
>> barrier_time = 00
>> barrier_time = 29
>> 
>> But what I see is much more like
>> barrier_time = 22
>> barrier_time = 29
>> barrier_time = 25
>> 
>> See? No process has a barrier_time equal to zero !!!
> 
> No process will ever have a *zero* time in a barrier; it's just not possible 
> (unless you're measuring in seconds, or something very coarse grained?).
> 
> What type of network are you using?
> 
>> 
>> 
>> On Sep 8, 2011, at 14:55, Jeff Squyres wrote:
>> 
>>> The order in which you see stdout printed from mpirun does not necessarily 
>>> reflect the order in which things were actually printed.  Remember that the 
>>> stdout from each MPI process needs to flow through at least 3 processes and 
>>> potentially across the network before it is actually displayed on mpirun's 
>>> stdout.
>>> 
>>> MPI process -> local Open MPI daemon -> mpirun -> printed to mpirun's stdout
>>> 
>>> Hence, the ordering of stdout can get transposed.
>>> 
>>> 
>>> On Sep 8, 2011, at 8:49 AM, Ghislain Lartigue wrote:
>>> 
>>>> Thank you for this explanation but indeed this confirms that the LAST 
>>>> process that hits the barrier should go through nearly instantaneously 
>>>> (except for the broadcast time for the acknowledgment signal).
>>>> And this is not what happens in my code: EVERY process waits for a very 
>>>> long time before going through the barrier (thousands of times more than a 
>>>> broadcast)...
>>>> 
>>>> 
>>>> On Sep 8, 2011, at 14:26, Jeff Squyres wrote:
>>>> 
>>>>> Order in which processes hit the barrier is only one factor in the time 
>>>>> it takes for that process to finish the barrier.
>>>>> 
>>>>> An easy way to think of a barrier implementation is a "fan in/fan out" 
>>>>> model.  When each nonzero rank process calls MPI_BARRIER, it sends a 
>>>>> message saying "I have hit the barrier!" (it usually sends it to its 
>>>>> parent in a tree of all MPI processes in the communicator, but you can 
>>>>> simplify this model and consider that it sends it to rank 0).  Rank 0 
>>>>> collects all of these messages.  When it has messages from all processes 
>>>>> in the communicator, it sends out "ok, you can leave the barrier now" 
>>>>> messages (again, it's usually via a tree distribution, but you can 
>>>>> pretend that it directly, linearly sends a message to each peer process 
>>>>> in the communicator).
>>>>> 
>>>>> Hence, the time that any individual process spends in the barrier is 
>>>>> relative to when every other process enters the barrier.  But it's 
>>>>> also dependent upon communication speed, congestion in the network, etc.
>>>>> 
>>>>> 
>>>>> On Sep 8, 2011, at 6:20 AM, Ghislain Lartigue wrote:
>>>>> 
>>>>>> Hello,
>>>>>> 
>>>>>> at a given point in my (Fortran90) program, I write:
>>>>>> 
>>>>>> ===================
>>>>>> start_time = MPI_Wtime()
>>>>>> call MPI_BARRIER(...)
>>>>>> new_time = MPI_Wtime() - start_time
>>>>>> write(*,*) "barrier time =",new_time
>>>>>> ==================
>>>>>> 
>>>>>> and then I run my code...
>>>>>> 
>>>>>> I expected that the values of "new_time" would range from 0 to Tmax 
>>>>>> (1700 in my case)
>>>>>> As I understand it, the first process that hits the barrier should print 
>>>>>> Tmax and the last process that hits the barrier should print 0 (or a 
>>>>>> very low value).
>>>>>> 
>>>>>> But this is not the case: all processes print values in the range 
>>>>>> 1400-1700!
>>>>>> 
>>>>>> Any explanation?
>>>>>> 
>>>>>> Thanks,
>>>>>> Ghislain.
>>>>>> 
>>>>>> PS:
>>>>>> This small code behaves perfectly in other parts of my code...
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> us...@open-mpi.org
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Jeff Squyres
>>>>> jsquy...@cisco.com
>>>>> For corporate legal information go to:
>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/