This problem has nothing to do with stdout... Example with 3 processes:
P0 hits barrier at t=12 P1 hits barrier at t=27 P2 hits barrier at t=41 In this situation: P0 waits 41-12 = 29 P1 waits 41-27 = 14 P2 waits 41-41 = 00 So I should see something like (no ordering is expected): barrier_time = 14 barrier_time = 00 barrier_time = 29 But what I see is much more like barrier_time = 22 barrier_time = 29 barrier_time = 25 See? No process has a barrier_time equal to zero !!! Le 8 sept. 2011 à 14:55, Jeff Squyres a écrit : > The order in which you see stdout printed from mpirun is not necessarily > reflective of what order things were actually printers. Remember that the > stdout from each MPI process needs to flow through at least 3 processes and > potentially across the network before it is actually displayed on mpirun's > stdout. > > MPI process -> local Open MPI daemon -> mpirun -> printed to mpirun's stdout > > Hence, the ordering of stdout can get transposed. > > > On Sep 8, 2011, at 8:49 AM, Ghislain Lartigue wrote: > >> Thank you for this explanation but indeed this confirms that the LAST >> process that hits the barrier should go through nearly instantaneously >> (except for the broadcast time for the acknowledgment signal). >> And this is not what happens in my code : EVERY process waits for a very >> long time before going through the barrier (thousands of times more than a >> broadcast)... >> >> >> Le 8 sept. 2011 à 14:26, Jeff Squyres a écrit : >> >>> Order in which processes hit the barrier is only one factor in the time it >>> takes for that process to finish the barrier. >>> >>> An easy way to think of a barrier implementation is a "fan in/fan out" >>> model. When each nonzero rank process calls MPI_BARRIER, it sends a >>> message saying "I have hit the barrier!" (it usually sends it to its parent >>> in a tree of all MPI processes in the communicator, but you can simplify >>> this model and consider that it sends it to rank 0). Rank 0 collects all >>> of these messages. 
>>> When it has messages from all processes in the
>>> communicator, it sends out "ok, you can leave the barrier now" messages
>>> (again, it's usually via a tree distribution, but you can pretend that it
>>> directly, linearly sends a message to each peer process in the
>>> communicator).
>>>
>>> Hence, the time that any individual process spends in the barrier is
>>> relative to when every other process enters the barrier. But it's
>>> also dependent upon communication speed, congestion in the network, etc.
>>>
>>>
>>> On Sep 8, 2011, at 6:20 AM, Ghislain Lartigue wrote:
>>>
>>>> Hello,
>>>>
>>>> at a given point in my (Fortran90) program, I write:
>>>>
>>>> ===================
>>>> start_time = MPI_Wtime()
>>>> call MPI_BARRIER(...)
>>>> new_time = MPI_Wtime() - start_time
>>>> write(*,*) "barrier time =",new_time
>>>> ==================
>>>>
>>>> and then I run my code...
>>>>
>>>> I expected that the values of "new_time" would range from 0 to Tmax (1700
>>>> in my case)
>>>> As I understand it, the first process that hits the barrier should print
>>>> Tmax and the last process that hits the barrier should print 0 (or a very
>>>> low value).
>>>>
>>>> But this is not the case: all processes print values in the range
>>>> 1400-1700!
>>>>
>>>> Any explanation?
>>>>
>>>> Thanks,
>>>> Ghislain.
>>>>
>>>> PS:
>>>> This small code behaves perfectly in other parts of my code...
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
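
For reference, the arithmetic in the three-process example at the top of this message can be checked with a small simulation. This is only a sketch of the idealized "fan in/fan out" picture from the thread, not of how Open MPI actually implements MPI_BARRIER. It makes no MPI calls; the function name barrier_wait_times and the parameter release_latency are hypothetical, introduced here for illustration. It assumes every process is released at the same instant, namely the last arrival time plus one fixed release latency.

```python
def barrier_wait_times(arrival_times, release_latency=0):
    """Idealized barrier: no process leaves before the last one has arrived.

    arrival_times   -- time at which each process enters the barrier
    release_latency -- assumed fixed cost of the "you can leave" message
                       (hypothetical parameter, identical for all processes)
    """
    # Everyone is released once the slowest process has arrived.
    release_time = max(arrival_times) + release_latency
    # Each process waits from its own arrival until the common release.
    return [release_time - t for t in arrival_times]


# Arrival times taken from the example: P0 at t=12, P1 at t=27, P2 at t=41.
arrivals = [12, 27, 41]
for rank, wait in enumerate(barrier_wait_times(arrivals)):
    print(f"P{rank}: barrier_time = {wait}")
```

With the arrival times from the example this gives waits of 29, 14 and 0. In this model the smallest wait always equals release_latency, so measurements whose minimum is far from zero (22, 25, 29) cannot be explained by arrival order alone.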