Re: [OMPI users] Fault Tolerance & Behavior

2006-10-30 Thread Troy Telford
On Sun, 29 Oct 2006 01:34:06 -0700, Gleb Natapov wrote: If you use the OB1 PML (the default), it will never recover from a link-down error, no matter how many other transports you have. The reason is that OB1 never tracks what happens to buffers submitted to a BTL. So if a BTL can't, for any reason, tr

Re: [OMPI users] MPI_Comm_spawn multiple bproc support problem

2006-10-30 Thread Ralph H Castain
On 1.1.2, what that error is telling you is that it didn't find any nodes in the environment. The bproc allocator looks for an environment variable NODES that contains a list of nodes assigned to you. This error indicates it didn't find anything. Did you get an allocation prior to running the jo

[OMPI users] MPI_Comm_spawn multiple bproc support problem

2006-10-30 Thread hpe...@infonie.fr
Hi, I have a problem using MPI_Comm_spawn_multiple together with bproc. I want to use the MPI_Comm_spawn_multiple call to spawn a set of executables, but in a bproc environment the program crashes or hangs on this call (depending on the Open MPI release used). I have created one test program th

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-30 Thread Michael Kluskens
I have tested for the MPI_ABORT problem I was seeing and it appears to be fixed in the trunk. Michael On Oct 28, 2006, at 8:45 AM, Jeff Squyres wrote: Sorry for the delay on this -- is this still the case with the OMPI trunk? We think we finally have all the issues solved with MPI_ABORT on