On Jun 29, 2010, at 3:44 AM, 王睿 wrote:

> 1, suppose a MPI program involves several nodes, if one node dead, will the
> program terminate?

Open MPI will terminate the whole job, yes.

> 2, Is there any possibility to extend or shrink the size of MPI communicator
> size? If so, we can use spare node to replace the dead node?

Currently, no. Fault tolerance and resiliency is an active topic of research
and discussion in the MPI-3 forum. But for the moment, most MPI
implementations -- including Open MPI -- have fairly draconian responses to
the loss of a process and/or node (i.e., kill the rest of the job).

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/