These "times" have no units; it's just an example. Whatever units are used, at least one process should spend a very small amount of time in the barrier (compared to the other processes), and this is not what I see in my code.
The network is supposed to be excellent: my machine is #9 in the top500 supercomputers (http://top500.org/system/10589).

Ghislain.

On Sep 8, 2011, at 15:34, Jeff Squyres wrote:

> On Sep 8, 2011, at 9:17 AM, Ghislain Lartigue wrote:
>
>> Example with 3 processes:
>>
>> P0 hits barrier at t=12
>> P1 hits barrier at t=27
>> P2 hits barrier at t=41
>
> What is the unit of time here, and how well are these times synchronized?
>
>> In this situation:
>> P0 waits 41-12 = 29
>> P1 waits 41-27 = 14
>> P2 waits 41-41 = 00
>>
>> So I should see something like (no ordering is expected):
>> barrier_time = 14
>> barrier_time = 00
>> barrier_time = 29
>>
>> But what I see is much more like
>> barrier_time = 22
>> barrier_time = 29
>> barrier_time = 25
>>
>> See? No process has a barrier_time equal to zero!
>
> No process will ever have a *zero* time in a barrier; it's just not possible
> (unless you're measuring in seconds, or something very coarse grained?).
>
> What type of network are you using?
>
>>
>>
>> On Sep 8, 2011, at 14:55, Jeff Squyres wrote:
>>
>>> The order in which you see stdout printed from mpirun is not necessarily
>>> reflective of the order in which things were actually printed. Remember
>>> that the stdout from each MPI process needs to flow through at least 3
>>> processes and potentially across the network before it is actually
>>> displayed on mpirun's stdout.
>>>
>>> MPI process -> local Open MPI daemon -> mpirun -> printed to mpirun's stdout
>>>
>>> Hence, the ordering of stdout can get transposed.
>>>
>>>
>>> On Sep 8, 2011, at 8:49 AM, Ghislain Lartigue wrote:
>>>
>>>> Thank you for this explanation but indeed this confirms that the LAST
>>>> process that hits the barrier should go through nearly instantaneously
>>>> (except for the broadcast time for the acknowledgment signal).
>>>> And this is not what happens in my code: EVERY process waits for a very
>>>> long time before going through the barrier (thousands of times more than a
>>>> broadcast)...
>>>>
>>>>
>>>> On Sep 8, 2011, at 14:26, Jeff Squyres wrote:
>>>>
>>>>> The order in which processes hit the barrier is only one factor in the
>>>>> time it takes for a given process to finish the barrier.
>>>>>
>>>>> An easy way to think of a barrier implementation is a "fan in/fan out"
>>>>> model. When each nonzero rank process calls MPI_BARRIER, it sends a
>>>>> message saying "I have hit the barrier!" (it usually sends it to its
>>>>> parent in a tree of all MPI processes in the communicator, but you can
>>>>> simplify this model and consider that it sends it to rank 0). Rank 0
>>>>> collects all of these messages. When it has messages from all processes
>>>>> in the communicator, it sends out "ok, you can leave the barrier now"
>>>>> messages (again, it's usually via a tree distribution, but you can
>>>>> pretend that it directly, linearly sends a message to each peer process
>>>>> in the communicator).
>>>>>
>>>>> Hence, the time that any individual process spends in the barrier is
>>>>> relative to when every other process enters the barrier. But it's
>>>>> also dependent upon communication speed, congestion in the network, etc.
>>>>>
>>>>>
>>>>> On Sep 8, 2011, at 6:20 AM, Ghislain Lartigue wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> at a given point in my (Fortran90) program, I write:
>>>>>>
>>>>>> ===================
>>>>>> start_time = MPI_Wtime()
>>>>>> call MPI_BARRIER(...)
>>>>>> new_time = MPI_Wtime() - start_time
>>>>>> write(*,*) "barrier time =",new_time
>>>>>> ==================
>>>>>>
>>>>>> and then I run my code...
>>>>>>
>>>>>> I expected that the values of "new_time" would range from 0 to Tmax
>>>>>> (1700 in my case).
>>>>>> As I understand it, the first process that hits the barrier should print
>>>>>> Tmax and the last process that hits the barrier should print 0 (or a
>>>>>> very low value).
>>>>>>
>>>>>> But this is not the case: all processes print values in the range
>>>>>> 1400-1700!
>>>>>>
>>>>>> Any explanation?
>>>>>>
>>>>>> Thanks,
>>>>>> Ghislain.
>>>>>>
>>>>>> PS:
>>>>>> This small code behaves perfectly in other parts of my code...
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> us...@open-mpi.org
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>>>
>>>>> --
>>>>> Jeff Squyres
>>>>> jsquy...@cisco.com
>>>>> For corporate legal information go to:
>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
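
As a rough illustration of the "fan in/fan out" model described earlier in the thread, the Python sketch below computes how long each rank would sit in an idealized linear barrier, given the times at which the ranks arrive. It is only a toy model, not MPI code and not a description of how Open MPI implements MPI_BARRIER: the function name `barrier_waits` and the single fixed `latency` parameter are assumptions made for this example, and the model ignores tree-based algorithms, network congestion, and clock skew between nodes.

```python
def barrier_waits(arrivals, latency=0):
    """Time each rank spends in an idealized linear fan-in/fan-out barrier.

    arrivals[i] is the time at which rank i calls the barrier.
    latency is a hypothetical, fixed one-way message time (same units).
    """
    # Fan-in: rank 0 knows about its own arrival immediately and hears
    # about every other rank one message latency after that rank arrives.
    seen_by_root = [t if rank == 0 else t + latency
                    for rank, t in enumerate(arrivals)]

    # Rank 0 can release the barrier once it has heard from everyone.
    release = max(seen_by_root)

    # Fan-out: rank 0 leaves at the release time; every other rank leaves
    # one message latency later, when the "you can leave" message arrives.
    exits = [release if rank == 0 else release + latency
             for rank in range(len(arrivals))]

    # Time in the barrier = exit time minus arrival time.
    return [exit_time - t for exit_time, t in zip(exits, arrivals)]


if __name__ == "__main__":
    # Arrival times from the 3-process example in the thread.
    print(barrier_waits([12, 27, 41]))             # prints [29, 14, 0]
    print(barrier_waits([12, 27, 41], latency=1))  # prints [30, 16, 2]
```

With zero latency the model reproduces the expected waits of 29, 14 and 0 from the 3-process example. With a nonzero latency, a non-root rank that arrives last waits about two message latencies rather than exactly zero, which is still small compared with the other ranks' waits in this model.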