Jeff, you were right. I did a series of spawns, each followed by a
merge, and forgot to set the exception-throwing error handler on the
newly created intra-communicators. Since this property is evidently not
inherited (which would be hard anyway, given that multiple communicators
are merged), the default, non-exception-throwing handler was installed
on them.
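
For anyone who would rather not depend on the C++ bindings: an
alternative is to set MPI_ERRORS_RETURN on every communicator, including
each freshly merged intra-communicator (MPI_Comm_set_errhandler), and to
turn non-success return codes into exceptions by hand. The sketch below
shows only that conversion step. So that it compiles without an MPI
installation, all names in it (kSuccess, SpawnError, check, fake_spawn,
try_spawn) are hypothetical, and fake_spawn merely stands in for the real
spawn call; it is an illustration under those assumptions, not tested
against Open MPI.

```cpp
#include <stdexcept>
#include <string>

// Assumption: 0 plays the role of MPI_SUCCESS in this MPI-free sketch.
constexpr int kSuccess = 0;

// Hypothetical exception type that carries the failing return code.
class SpawnError : public std::runtime_error {
public:
    SpawnError(const std::string& what, int code)
        : std::runtime_error(what + " failed with error code " +
                             std::to_string(code)),
          code_(code) {}

    int code() const noexcept { return code_; }

private:
    int code_;
};

// Throws SpawnError if rc signals failure; otherwise does nothing.
inline void check(int rc, const std::string& what) {
    if (rc != kSuccess) {
        throw SpawnError(what, rc);
    }
}

// Hypothetical stand-in for a spawn call made on a communicator whose
// error handler returns codes instead of aborting the job.
// The failure code 17 is arbitrary.
inline int fake_spawn(bool launch_ok) {
    return launch_ok ? kSuccess : 17;
}

// Returns true if the spawn succeeded, false if the failure was caught
// and handled, so the caller can shut down gracefully.
inline bool try_spawn(bool launch_ok) {
    try {
        check(fake_spawn(launch_ok), "spawn");
        return true;
    } catch (const SpawnError&) {
        return false;
    }
}
```

In a real program, check would wrap the return value of each MPI call,
and the handler would have to be set again on every communicator that a
merge produces.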

Thanks!

Murat


Jeff Squyres wrote:
> On Nov 7, 2007, at 7:43 PM, Murat Knecht wrote:
>
>   
>> When MPI_Comm_spawn cannot launch an application for whatever reason,
>> the entire job is cancelled with a message like the following.
>>     
>
> That is correct; MPI states that the default error handler is
> MPI_ERRORS_ARE_FATAL.
>
>   
>> Is there a way to handle this nicely, e.g. by throwing an exception? I
>>     
>
> Sure; change the error handler on the communicator that you are
> using in the call to MPI_COMM_SPAWN.
>
> I don't know if we have checked this particular code path to ensure  
> that OMPI will be stable after this, but it might work...
>
>   
>> understand that this does not work when the job is first started with
>> mpirun, as there is no application yet to fall back on, but in the
>> case of a running application it should be possible to simply inform
>> it that the spawn request failed. Then the application could begin to
>> handle the error and terminate gracefully. I did enable C++
>> exceptions, by the way, so I guess this is not implemented. Is there a
>> technical (e.g. architectural) reason behind this, or is it simply a
>> yet-to-be-added feature?
>>     
>
> The MPI layer is written in C; it will not throw exceptions unless you
> use the MPI C++ bindings and set the MPI::ERRORS_THROW_EXCEPTIONS
> error handler.  Also be sure to configure/build Open MPI with the
> --enable-cxx-exceptions flag, so that the C code is compiled with the
> flags needed to propagate C++ exceptions (it's not enabled by default
> because it imposes a slight performance penalty).
>
>   
