Isn't it evident from probability theory and the theory of random
processes that, in the limit of an infinitely large cluster and an
infinitely long-running parallel job, the probability of a deadlock with
the current implementation is, unfortunately, a finite quantity, and in
the limit it approaches unity regardless of the particular details of the
program?
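A back-of-the-envelope way to see this (my own sketch; the fixed per-step
probability p is an assumption, not something established above): if each
communication step independently carries some probability p > 0 of
triggering the hang, then

  P(no deadlock in n steps) = (1 - p)^n  ->  0  as  n -> infinity,

so over enough steps on a large enough machine a deadlock becomes all but
certain.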

Just my two cents.
Alex Granovsky

  ----- Original Message ----- 
  From: Richard Treumann 
  To: Open MPI Users 
  Sent: Thursday, September 09, 2010 10:10 PM
  Subject: Re: [OMPI users] MPI_Reduce performance



  I was pointing out that most programs have some degree of elastic
synchronization built in.  Tasks (or groups or components in a coupled
model) seldom only produce data; they also consume what other tasks
produce, and that limits the potential skew.

  If step n for a task (or group or coupled component) depends on data
produced by step n-1 in another task (or group or coupled component), then
no task can be farther ahead of the task it depends on than one step.  If
there are 2 tasks that each need the other's step n-1 result to compute
step n, then they can never get more than one step out of sync.  If there
were a rank-ordered loop of 8 tasks, so that each one needs the prior
step's output from task ((me-1) mod tasks) to compute, then you can get
more skew (see the sketch after this list) because, if
  task 5 gets stalled in step 3,
  task 6 will finish step 3 and send results to 7 but stall on the recv
for step 4 (lacking the end-of-step-3 send by task 5),
  task 7 will finish step 4 and send results to 0 but stall on the recv
for step 5,
  task 0 will finish step 5 and send results to 1 but stall on the recv
for step 6,
  and so on.
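  A minimal sketch of that rank-ordered loop (illustrative only; the
variable names and the toy update are mine, not from any real code
discussed in this thread).  Each rank must receive its left neighbour's
previous result before it can complete a step, so a stall propagates
around the ring one step per hop, exactly as in the sequence above:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int me, np;
    MPI_Comm_rank(MPI_COMM_WORLD, &me);
    MPI_Comm_size(MPI_COMM_WORLD, &np);

    int left  = (me - 1 + np) % np;  /* rank whose output we need    */
    int right = (me + 1) % np;       /* rank that needs our output   */
    double val = (double)me;         /* stand-in for real step data  */

    for (int step = 0; step < 10; step++) {
        double from_left;
        /* We cannot complete this step until the left neighbour has
         * sent its result for it, so a stall at any rank delays each
         * downstream rank in turn, one extra step per hop around the
         * ring -- the skew pattern described above.                  */
        MPI_Sendrecv(&val, 1, MPI_DOUBLE, right, step,
                     &from_left, 1, MPI_DOUBLE, left, step,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        val = 0.5 * (val + from_left);  /* toy "compute" for the step */
    }

    printf("rank %d finished with %g\n", me, val);
    MPI_Finalize();
    return 0;
}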

  In a 2D or 3D grid, the dependency is tighter, so the possible skew is
less, but it is still significant on a huge grid.  In a program with
frequent calls to MPI_Allreduce on COMM_WORLD, the skew is very limited.
The available skew gets harder to predict as the interdependencies grow
more complex.
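  For the Allreduce case, a similarly illustrative sketch (the toy work
inside the loop is made up): because MPI_Allreduce on COMM_WORLD cannot
complete on any rank until every rank has entered it, no rank can get more
than about one iteration ahead of the slowest one:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int me;
    MPI_Comm_rank(MPI_COMM_WORLD, &me);

    double local = (double)(me + 1), global;
    for (int step = 0; step < 10; step++) {
        /* Collective on COMM_WORLD: every rank must contribute before
         * any rank receives the result, so the call acts as a loose
         * barrier that keeps the skew to roughly one iteration.      */
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                      MPI_COMM_WORLD);
        local = global / (step + 2);    /* toy per-iteration work     */
    }

    if (me == 0) printf("final value %g\n", local);
    MPI_Finalize();
    return 0;
}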

  I call this "elasticity" because the amount of stretch varies but, like
a bungee cord or a waistband, only goes so far.  Every parallel program
has some degree of elasticity built into the way its parts interact.

  I assume a coupler has some elasticity too.  That is, ocean and
atmosphere each model Monday and report in to the coupler, but neither can
model Tuesday until it gets some of the Monday results generated by the
other.  (I am pretending the granularity is day by day.)  Wouldn't the
right level of synchronization among components result automatically from
the data dependencies among them?

    

  Dick Treumann  -  MPI Team           
  IBM Systems & Technology Group
  Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
  Tele (845) 433-7846         Fax (845) 433-8363



        From:  Eugene Loh <eugene....@oracle.com>  
        To:  Open MPI Users <us...@open-mpi.org>  
        Date:  09/09/2010 12:40 PM  
        Subject:  Re: [OMPI users] MPI_Reduce performance  
        Sent by:  users-boun...@open-mpi.org 


------------------------------------------------------------------------------



  Gus Correa wrote:

  > More often than not some components lag behind (regardless of how
  > much you tune the number of processors assigned to each component),
  > slowing down the whole scheme.
  > The coupler must sit and wait for that late component,
  > the other components must sit and wait for the coupler,
  > and the (vicious) "positive feedback" cycle that
  > Ashley mentioned goes on and on.

  I think "sit and wait" is the "typical" scenario that Dick mentions.  
  Someone lags, so someone else has to wait.

  In contrast, the "feedback" cycle Ashley mentions is where someone lags 
  and someone else keeps racing ahead, pumping even more data at the 
  laggard, forcing the laggard ever further behind.