[O-MPI users] Questions about Open MPI

2005-05-05 Thread atarpley
Hello,

1) When will the final Open MPI be released (non development)?

2) What fault tolerance mechanisms will be included?  Specifically, if a node 
goes down, what happens?  Will everything bomb?

3) There will be FULL multi-threading support, correct?

That's it for now.  Thank you.



Re: [O-MPI users] Questions about Open MPI

2005-05-05 Thread George Bosilca


On May 5, 2005, at 7:58 AM, atarpley wrote:



1) When will the final Open MPI be released (non development)?



As soon as everybody is happy with the stability and the features of
a version. As with most HPC software, SC05 seems like a reasonable
deadline. Meanwhile, a beta version will be released soon (no specific
date is available at the moment).



2) What fault tolerance mechanisms will be included?  Specifically, if a node
goes down, what happens?  Will everything bomb?



Several models of fault tolerance will be included. Maybe not in the
first release, but there are several teams already working on such
projects. A short list of these fault tolerance mechanisms follows:

1. coordinated checkpointing, Chandy-Lamport style (a la LAM)
2. uncoordinated checkpointing (a la MPICH-V)
3. one similar to FT-MPI

In short: most of the usual fault-tolerance mechanisms will be
included.
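
To make the coordinated approach in item 1 more concrete, here is a
minimal sketch in Python. It is a toy model only, not Open MPI or LAM
code: the function name run, its parameters, and the "computation" are
all hypothetical, and it skips the marker messages that the
Chandy-Lamport algorithm uses to capture in-flight messages. It only
shows the core idea that all ranks save state at the same logical
point and roll back together after a failure.

```python
def run(num_ranks, num_steps, checkpoint_every, fail_at=None):
    """Toy model of coordinated checkpointing (hypothetical, not Open MPI).

    Every rank accumulates a counter. All ranks checkpoint together every
    `checkpoint_every` steps. If `fail_at` is given, a failure is simulated
    once at that step and all ranks roll back to the last checkpoint.
    Returns (final per-rank state, number of rollbacks).
    """
    state = [0] * num_ranks              # per-rank application state
    checkpoint = (0, list(state))        # (step, saved global state)
    rollbacks = 0
    step = 0
    while step < num_steps:
        if fail_at is not None and step == fail_at:
            # Simulated node failure: every rank restores the last
            # coordinated checkpoint and execution resumes from there.
            step, saved = checkpoint
            state = list(saved)
            fail_at = None               # the failure happens only once
            rollbacks += 1
            continue
        for rank in range(num_ranks):
            state[rank] += rank + step   # stand-in for real computation
        step += 1
        if step % checkpoint_every == 0:
            # Coordinated: all ranks save at the same logical point, so
            # the saved global state is consistent by construction.
            checkpoint = (step, list(state))
    return state, rollbacks


if __name__ == "__main__":
    clean, _ = run(4, 10, 3)
    recovered, rollbacks = run(4, 10, 3, fail_at=7)
    print(clean == recovered, rollbacks)
```

In this sketch a run that fails at step 7 restarts from the checkpoint
taken at step 6 and ends in the same state as a failure-free run; the
cost is the replayed work since the last checkpoint.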


The behavior of the application when a node goes down depends on the
user's choice (via parameters at initialization time). If the user
leaves the error handler on the MPI communicators set to fatal, then
of course everything will be torn down by the Open MPI runtime
environment. Otherwise, one of the fault tolerance mechanisms (again
depending on user parameters) will take care of the rest of the
execution.





3) There will be FULL multi-threading support, correct?



Correct, except that the FULL multi-threading support is already
in place. We are currently testing the multi-threaded support for all
of the drivers (so far only TCP is considered to be multi-thread
compliant). The next step will be to look at performance, as we
are using fine-grained locking mechanisms.

This feature will definitely be in the stable release.
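
As an illustration of what fine-grained locking means in general, here
is a small Python sketch. It is not taken from Open MPI: the class
FineGrainedQueues and the function demo are hypothetical names, and
the queues merely stand in for whatever internal structures a library
might protect. The point is that each structure has its own lock, so
threads touching different structures do not serialize on one global
lock. The price is more lock operations, which is why performance
needs a separate look.

```python
import threading


class FineGrainedQueues:
    """Hypothetical container: one lock per queue, no global lock."""

    def __init__(self, num_queues):
        self.queues = [[] for _ in range(num_queues)]
        self.locks = [threading.Lock() for _ in range(num_queues)]

    def push(self, index, item):
        # Only the target queue is locked, so threads working on
        # different queues do not block each other.
        with self.locks[index]:
            self.queues[index].append(item)


def demo(num_queues=4, num_threads=8, items_per_thread=1000):
    """Push items from several threads and return the length of each queue."""
    shared = FineGrainedQueues(num_queues)

    def worker(tid):
        for i in range(items_per_thread):
            shared.push((tid + i) % num_queues, (tid, i))

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [len(q) for q in shared.queues]


if __name__ == "__main__":
    print(demo())
```

With the default arguments every queue should end up with the same
number of items, and no item is lost even though eight threads push
concurrently.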

  Thanks,
george.


"We must accept finite disappointment, but we must never lose infinite
hope."
  Martin Luther King