On Jun 29, 2010, at 3:44 AM, 王睿 wrote:

> 1. Suppose an MPI program runs across several nodes. If one node dies, will 
> the program terminate? 

Open MPI will terminate the whole job, yes.

> 2. Is it possible to extend or shrink the size of an MPI communicator? 
> If so, could we use a spare node to replace the dead node?  

Currently, no.

Fault tolerance and resiliency are active topics of research and discussion in 
the MPI-3 forum.  But for the moment, most MPI implementations -- including 
Open MPI -- have fairly draconian responses to the loss of a process and/or 
node (i.e., kill the rest of the job).

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

