I guess you forgot to count the "leaving time" (fan-out). When everyone hits the barrier, each process still needs an "ack" before it can leave. And remember that in most cases the leader process sends out the "acks" sequentially. So it's very possible that:

P0 barrier time = 29 + send/recv ack 0
P1 barrier time = 14 + send ack 0 + send/recv ack 1
P2 barrier time =  0 + send ack 0 + send ack 1 + send/recv ack 2

That's your measured time.
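To make that accounting concrete, here is a minimal sketch of a linear fan-in/fan-out barrier built from plain sends and receives. This is only an illustration of the model -- Open MPI's real barrier normally uses a tree, and the program and variable names here are made up for the example:

===================
program linear_barrier
  use mpi
  implicit none
  integer :: ierr, rank, nprocs, i, dummy
  integer :: status(MPI_STATUS_SIZE)
  double precision :: t0, t1

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  dummy = 0

  t0 = MPI_Wtime()
  if (rank == 0) then
     ! fan-in: collect "I have hit the barrier!" from every other rank
     do i = 1, nprocs - 1
        call MPI_Recv(dummy, 1, MPI_INTEGER, MPI_ANY_SOURCE, 1, &
                      MPI_COMM_WORLD, status, ierr)
     end do
     ! fan-out: the leader releases the ranks one at a time --
     ! this sequential ack loop is the "leaving time"
     do i = 1, nprocs - 1
        call MPI_Send(dummy, 1, MPI_INTEGER, i, 2, MPI_COMM_WORLD, ierr)
     end do
  else
     call MPI_Send(dummy, 1, MPI_INTEGER, 0, 1, MPI_COMM_WORLD, ierr)
     call MPI_Recv(dummy, 1, MPI_INTEGER, 0, 2, MPI_COMM_WORLD, status, ierr)
  end if
  t1 = MPI_Wtime()

  write(*,*) "rank", rank, "barrier time =", t1 - t0

  call MPI_Finalize(ierr)
end program linear_barrier
===================

Note that even the last rank to arrive still has to wait for the leader to work through the ack loop before it can leave, so nobody measures a time near zero.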
Teng

> This problem has nothing to do with stdout...
>
> Example with 3 processes:
>
> P0 hits barrier at t=12
> P1 hits barrier at t=27
> P2 hits barrier at t=41
>
> In this situation:
> P0 waits 41-12 = 29
> P1 waits 41-27 = 14
> P2 waits 41-41 = 00
>
> So I should see something like (no ordering is expected):
> barrier_time = 14
> barrier_time = 00
> barrier_time = 29
>
> But what I see is much more like
> barrier_time = 22
> barrier_time = 29
> barrier_time = 25
>
> See? No process has a barrier_time equal to zero!!!
>
>
> On Sep 8, 2011, at 2:55 PM, Jeff Squyres wrote:
>
>> The order in which you see stdout printed from mpirun is not necessarily
>> reflective of the order in which things were actually printed. Remember
>> that the stdout from each MPI process needs to flow through at least 3
>> processes and potentially across the network before it is actually
>> displayed on mpirun's stdout.
>>
>> MPI process -> local Open MPI daemon -> mpirun -> printed to mpirun's
>> stdout
>>
>> Hence, the ordering of stdout can get transposed.
>>
>>
>> On Sep 8, 2011, at 8:49 AM, Ghislain Lartigue wrote:
>>
>>> Thank you for this explanation, but indeed this confirms that the LAST
>>> process that hits the barrier should go through nearly instantaneously
>>> (except for the broadcast time of the acknowledgment signal).
>>> And this is not what happens in my code: EVERY process waits for a
>>> very long time before going through the barrier (thousands of times
>>> longer than a broadcast)...
>>>
>>>
>>> On Sep 8, 2011, at 2:26 PM, Jeff Squyres wrote:
>>>
>>>> The order in which processes hit the barrier is only one factor in
>>>> the time it takes each process to finish the barrier.
>>>>
>>>> An easy way to think of a barrier implementation is a "fan in/fan out"
>>>> model. When each nonzero-rank process calls MPI_BARRIER, it sends a
>>>> message saying "I have hit the barrier!" (it usually sends it to its
>>>> parent in a tree of all MPI processes in the communicator, but you can
>>>> simplify this model and consider that it sends it to rank 0). Rank 0
>>>> collects all of these messages. When it has messages from all
>>>> processes in the communicator, it sends out "ok, you can leave the
>>>> barrier now" messages (again, usually via a tree distribution, but you
>>>> can pretend that it directly, linearly sends a message to each peer
>>>> process in the communicator).
>>>>
>>>> Hence, the time that any individual process spends in the barrier is
>>>> relative to when every other process enters the barrier. But it's also
>>>> dependent upon communication speed, congestion in the network, etc.
>>>>
>>>>
>>>> On Sep 8, 2011, at 6:20 AM, Ghislain Lartigue wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> at a given point in my (Fortran90) program, I write:
>>>>>
>>>>> ===================
>>>>> start_time = MPI_Wtime()
>>>>> call MPI_BARRIER(...)
>>>>> new_time = MPI_Wtime() - start_time
>>>>> write(*,*) "barrier time =", new_time
>>>>> ===================
>>>>>
>>>>> and then I run my code...
>>>>>
>>>>> I expected that the values of "new_time" would range from 0 to Tmax
>>>>> (1700 in my case).
>>>>> As I understand it, the first process that hits the barrier should
>>>>> print Tmax and the last process that hits the barrier should print 0
>>>>> (or a very low value).
>>>>>
>>>>> But this is not the case: all processes print values in the range
>>>>> 1400-1700!
>>>>>
>>>>> Any explanation?
>>>>>
>>>>> Thanks,
>>>>> Ghislain.
>>>>>
>>>>> PS:
>>>>> This small code behaves perfectly in other parts of my code...
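For reference, here is Ghislain's fragment above fleshed out into a self-contained test. This is a sketch: the MPI_COMM_WORLD argument, the rank prefix in the output, and the init/finalize scaffolding are assumed, since the original post shows only four lines:

===================
program barrier_timing
  use mpi
  implicit none
  integer :: ierr, rank
  double precision :: start_time, new_time

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  ! ... the (possibly imbalanced) work being measured goes here ...

  start_time = MPI_Wtime()
  ! communicator is an assumption; the original elides the arguments
  call MPI_BARRIER(MPI_COMM_WORLD, ierr)
  new_time = MPI_Wtime() - start_time
  write(*,*) "rank", rank, "barrier time =", new_time

  call MPI_Finalize(ierr)
end program barrier_timing
===================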
| Teng Ma              Univ. of Tennessee |
| t...@cs.utk.edu      Knoxville, TN      |
| http://web.eecs.utk.edu/~tma/           |