I forgot to mention: I tried setting odls_base_sigkill_timeout as you suggested, but even 5s was not sufficient for the root to execute its task. Most importantly, the kill was instantaneous; there was no 5s delay at all. Hence my erroneous conclusion that SIGKILL was being sent instead of SIGTERM.
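For the record, I passed the parameter on the command line the way you showed, roughly as follows (the executable name ./my_app and the process count are just placeholders for my real application):

mpirun -mca odls_base_sigkill_timeout 5 -np 4 ./my_app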
About the man page: at least for me, the word "kill" alone is not clear. Spelling out the SIGTERM+SIGKILL sequence would be unambiguous.
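For anyone finding this thread in the archives, here is a trimmed but self-contained sketch of the workaround I describe below. It uses Boost.MPI as in my snippets; the real work is replaced by placeholder comments, and, unlike the fragment below which posts a single recv, I loop over all slaves so it also works with more than one of them. Treat it as a rough sketch, not a polished implementation.

#include <boost/mpi.hpp>
#include <signal.h>

namespace mpi = boost::mpi;

enum tag { work_tag, die_tag };

volatile sig_atomic_t unexpected_error_occurred = 0;

void my_handler( int /* sig */ )
{
  unexpected_error_occurred = 1;
}

int main( int argc, char* argv[] )
{
  mpi::environment env(argc, argv);
  mpi::communicator world;
  const int root = 0;

  signal(SIGTERM, my_handler);

  if (world.rank() == root) {

    // do stuff (the actual work goes here)

    // collect one status message from every slave
    int unexpected_error_on_slave = 0;
    for (int i = 1; i < world.size(); ++i) {
      int flag = 0;
      world.recv(mpi::any_source, die_tag, flag);
      if (flag) unexpected_error_on_slave = 1;
    }

    if (unexpected_error_occurred || unexpected_error_on_slave) {

      // save something for the restart

      world.abort(SIGABRT);
    }
  } else { // slave process

    // do different stuff

    if (unexpected_error_occurred) {
      // just communicate the problem to the root
      world.send(root, die_tag, 1);
      signal(SIGTERM, SIG_DFL);
      while (true)
        ; // hang here, the master will abort the whole job
    }
    world.send(root, die_tag, 0); // everything is fine
  }

  signal(SIGTERM, SIG_DFL); // reassign the default handler

  // the code continues...
  return 0;
}

It is certainly not pretty, but it gives the root a chance to save its state before the whole job is torn down.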
Regards,
Júlio.

2012/3/25 Ralph Castain <r...@open-mpi.org>

> On Mar 25, 2012, at 7:19 AM, Júlio Hoffimann wrote:
>
> Dear Ralph,
>
> Thank you for your prompt reply. I confirmed what you just said by reading
> the mpirun man page, in the sections *Signal Propagation* and *Process
> Termination / Signal Handling*.
>
> "During the run of an MPI application, if any rank dies abnormally (either
> exiting before invoking MPI_FINALIZE, or dying as the result of a signal),
> mpirun will print out an error message and kill the rest of the MPI
> application."
>
> If I understood correctly, the SIGKILL signal is sent to every process on
> a premature death.
>
> Each process receives a SIGTERM, and then a SIGKILL if it doesn't exit
> within a specified time frame. I told you how to adjust that time period in
> the prior message.
>
> From my point of view, I consider this a bug. If Open MPI allows handling
> signals such as SIGTERM, the other processes in the communicator should
> also have the opportunity to die gracefully. Perhaps I'm missing something?
>
> Yes, you are - you do get a SIGTERM first, but you are required to exit in
> a timely fashion. You are not allowed to continue running. This is required
> in order to ensure proper cleanup of the job, per the MPI standard.
>
> Supposing the behaviour described in the last paragraph, I think it would
> be great to explicitly mention SIGKILL in the man page, or, even better, to
> fix the implementation to send SIGTERM instead, making it possible for the
> user to clean up all processes before exit.
>
> We already do, as described above.
>
> I solved my particular problem by adding another flag,
> *unexpected_error_on_slave*:
>
> volatile sig_atomic_t unexpected_error_occurred = 0;
> int unexpected_error_on_slave = 0;
> enum tag { work_tag, die_tag };
>
> void my_handler( int sig ){
>   unexpected_error_occurred = 1;
> }
>
> //
> // somewhere in the code...
> //
> signal(SIGTERM, my_handler);
>
> if (root process) {
>
>   // do stuff
>
>   world.recv(mpi::any_source, die_tag, unexpected_error_on_slave);
>   if ( unexpected_error_occurred || unexpected_error_on_slave ) {
>
>     // save something
>
>     world.abort(SIGABRT);
>   }
> } else { // slave process
>
>   // do different stuff
>
>   if ( unexpected_error_occurred ) {
>     // just communicate the problem to the root
>     world.send(root, die_tag, 1);
>     signal(SIGTERM, SIG_DFL);
>     while(true)
>       ; // wait, the master will take care of this
>   }
>   world.send(root, die_tag, 0); // everything is fine
> }
>
> signal(SIGTERM, SIG_DFL); // reassign default handler
> // the code continues...
>
> Note the slave must hang for the store operation to get executed at the
> root, otherwise we are back to the previous scenario. It should
> theoretically be unnecessary to send MPI messages to accomplish the desired
> cleanup, and in more complex applications this can turn into a nightmare.
> As we know, asynchronous events are insane to debug.
>
> Best regards,
> Júlio.
>
> P.S.: Open MPI 1.4.3 from the Ubuntu 11.10 repositories.
>
> 2012/3/23 Ralph Castain <r...@open-mpi.org>
>
>> Well, yes and no. When a process abnormally terminates, OMPI will kill
>> the job - this is done by first hitting each process with a SIGTERM,
>> followed shortly thereafter by a SIGKILL. So you do have a short time on
>> each process to attempt to clean up.
>>
>> My guess is that your signal handler actually is getting called, but we
>> then kill the process before you can detect that it was called.
>>
>> You might try adjusting the time between SIGTERM and SIGKILL using the
>> odls_base_sigkill_timeout MCA param:
>>
>> mpirun -mca odls_base_sigkill_timeout N
>>
>> should cause it to wait for N seconds before issuing the SIGKILL. Not
>> sure if that will help or not - it used to work for me, but I haven't
>> tried it for a while. What versions of OMPI are you using?
>>
>> On Mar 22, 2012, at 4:49 PM, Júlio Hoffimann wrote:
>>
>> Dear all,
>>
>> I'm trying to handle signals inside an MPI task-farming model. Following
>> is pseudo-code of what I'm trying to achieve:
>>
>> volatile sig_atomic_t unexpected_error_occurred = 0;
>>
>> void my_handler( int sig ){
>>   unexpected_error_occurred = 1;
>> }
>>
>> //
>> // somewhere in the code...
>> //
>> signal(SIGTERM, my_handler);
>>
>> if (root process) {
>>
>>   // do stuff
>>
>>   if ( unexpected_error_occurred ) {
>>
>>     // save something
>>
>>     // re-raise the SIGTERM, but now with the default handler
>>     signal(SIGTERM, SIG_DFL);
>>     raise(SIGTERM);
>>   }
>> } else { // slave process
>>
>>   // do different stuff
>>
>>   if ( unexpected_error_occurred ) {
>>     // just propagate the signal to the root
>>     signal(SIGTERM, SIG_DFL);
>>     raise(SIGTERM);
>>   }
>> }
>>
>> signal(SIGTERM, SIG_DFL); // reassign default handler
>> // the code continues...
>>
>> As can be seen, the signal handling is required for implementing a
>> restart feature. The whole problem resides in the assumption I made that
>> all processes in the communicator will receive a SIGTERM as a side effect.
>> Is it a valid assumption? How does the actual MPI implementation deal with
>> such scenarios?
>>
>> I also tried to replace all the raise() calls with MPI_Abort(), which,
>> according to the documentation
>> (http://www.open-mpi.org/doc/v1.5/man3/MPI_Abort.3.php), sends a SIGTERM
>> to all associated processes. The undesired behaviour persists: when
>> killing a slave process, the save section in the root branch is not
>> executed.
>>
>> Appreciate any help,
>> Júlio.

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users