Jeff, you were right. I did a series of Spawns and consecutive Merges and forgot to set the exception handler with the newly created intra-communicators. Since these properties obviously are not inherited (which would be kind of hard considering that there are multiple communicators to be merged), the default non-exception-throwing handler was installed.
Thanks! Murat Jeff Squyres schrieb: > On Nov 7, 2007, at 7:43 PM, Murat Knecht wrote: > > >> when MPI_Spawn cannot launch an application for whatever reason, the >> entire job is cancelled with some message like the following. >> > > That is correct; MPI states that the default error handler is > MPI_ERRORS_ABORT. > > >> Is there a way to handle this nicely, e.g. by throwing an exception? I >> > > Sure; change the default error handler on the communicator in which > you are using in the call to COMM_SPAWN. > > I don't know if we have checked this particular code path to ensure > that OMPI will be stable after this, but it might work... > > >> understand, this does not work, when the job is first started with >> mpirun, as there is no application yet to fall back on, but in case >> of a >> running application, it should be possible to simply inform it that >> the >> spawning request failed. Then the application could begin to handle >> the >> error and terminate gracefully. I did enable C++ Exceptions btw, so I >> guess this is not implemented. Is there a technical (e.g. >> architectural) >> reason behind this, or simply a yet-to-be-added feature? >> > > The MPI layer is written in C; it will not throw exceptions unless you > use the MPI C++ bindings to enable the MPI::ERRORS_THROW_EXCEPTIONS > error handler. Also be sure to use the right compiler flags to enable > the C compiler to propagate C++ exceptions when you configure/build > Open MPI via the --enable-cxx-exceptions flag (it's not enabled by > default because it imposes a slight performance penalty). > >