Eugene,

you did not take into account the dispersion/dephasing between different
processes. As the cluster size and the number of parallel process instances
increase, the dispersion increases as well, so the instances drift somewhat out
of sync - not truly out of sync, just skewed by the different execution speeds
on different nodes, delays, etc. If you account for this, you get the result I
mentioned.
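
A minimal sketch of what I mean, as hypothetical standard-MPI C (the file name,
the imbalance figures and the use of a barrier as the measuring point are my own
illustration, not anything taken from Open MPI itself): each rank does local work
of slightly different duration, and the spread of the waiting times at the next
collective is the dispersion I am talking about - it tends to grow with the
number of ranks.

/* skew_sketch.c - hypothetical illustration: each rank performs local "work"
 * of slightly different duration, then enters a barrier; the time a rank spends
 * waiting in the barrier shows how far it ran ahead of the slowest rank.
 * Rank 0 prints the min/max wait, i.e. the dispersion across the job.
 * Example build/run: mpicc skew_sketch.c -o skew_sketch && mpirun -np 16 ./skew_sketch
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Deliberately uneven local work, standing in for node-to-node speed
       differences, OS jitter, network delays, etc. */
    volatile double x = 0.0;
    long iters = 1000000L + 200000L * (rank % 8);
    for (long i = 0; i < iters; i++) x += 1e-9 * (double)i;

    /* Time spent blocked in the barrier ~ how far this rank ran ahead. */
    double t0 = MPI_Wtime();
    MPI_Barrier(MPI_COMM_WORLD);
    double wait = MPI_Wtime() - t0;

    double wmin, wmax;
    MPI_Reduce(&wait, &wmin, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
    MPI_Reduce(&wait, &wmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("np=%d  barrier wait: min %.6f s  max %.6f s  (spread ~ dispersion)\n",
               size, wmin, wmax);

    MPI_Finalize();
    return 0;
}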

Alex

  ----- Original Message ----- 
  From: Eugene Loh 
  To: Open MPI Users 
  Sent: Thursday, September 09, 2010 11:32 PM
  Subject: Re: [OMPI users] MPI_Reduce performance


  Alex A. Granovsky wrote: 
    Isn't it evident from the theory of random processes and probability theory
    that, in the limit of an infinitely large cluster and number of parallel
    processes, the probability of deadlock with the current implementation is
    unfortunately a finite quantity, and in the limit it approaches unity
    regardless of any particular details of the program.
  No, not at all.  Consider simulating a physical volume.  Each process is 
assigned to some small subvolume.  It updates conditions locally, but on the 
surface of its simulation subvolume it needs information from "nearby" 
processes.  It cannot proceed along the surface until it has that neighboring 
information.  Its neighbors, in turn, cannot proceed until their neighbors have 
reached some point.  Two distant processes can be quite out of step with one 
another, but only by some bounded amount.  At some point, a leading process has 
to wait for information from a laggard to propagate to it.  All processes 
proceed together, in some loose lock-step fashion.  Many applications behave in 
this fashion.  Actually, in many applications, the synchronization is tightened 
in that "physics" is made to propagate faster than neighbor-by-neighbor.

  As the number of processes increases, the laggard might seem relatively 
slower in comparison, but that isn't deadlock.

  As the size of the cluster increases, the chances of a system component 
failure increase, but that also is a different matter.


