On Mar 25, 2012, at 10:57 AM, Júlio Hoffimann wrote:

> I forgot to mention: I tried setting odls_base_sigkill_timeout as you
> suggested, and even 5s was not sufficient for the root to execute its task.
> More importantly, the kill was instantaneous - there was no 5s delay. Hence
> my erroneous conclusion that SIGKILL was being sent instead of SIGTERM.

Which version are you using? Could be a bug in there - I can take a look.

> About the man page: at least for me, the word "kill" is not clear. The
> SIGTERM+SIGKILL keywords would be unambiguous.

I'll clarify it - thanks!

> Regards,
> Júlio.
>
> 2012/3/25 Ralph Castain <r...@open-mpi.org>
>
> On Mar 25, 2012, at 7:19 AM, Júlio Hoffimann wrote:
>
>> Dear Ralph,
>>
>> Thank you for your prompt reply. I confirmed what you just said by reading
>> the mpirun man page, in the sections Signal Propagation and Process
>> Termination / Signal Handling:
>>
>> "During the run of an MPI application, if any rank dies abnormally (either
>> exiting before invoking MPI_FINALIZE, or dying as the result of a signal),
>> mpirun will print out an error message and kill the rest of the MPI
>> application."
>>
>> If I understood correctly, the SIGKILL signal is sent to every process on a
>> premature death.
>
> Each process receives a SIGTERM, and then a SIGKILL if it doesn't exit within
> a specified time frame. I told you how to adjust that time period in the
> prior message.
>
>> From my point of view, this is a bug. If Open MPI allows handling signals
>> such as SIGTERM, the other processes in the communicator should also have
>> the opportunity to exit gracefully. Perhaps I'm missing something?
>
> Yes, you are - you do get a SIGTERM first, but you are required to exit in a
> timely fashion. You are not allowed to continue running. This is required in
> order to ensure proper cleanup of the job, per the MPI standard.
>
>> Assuming the behaviour described above, I think it would be great to
>> explicitly mention SIGKILL in the man page, or, even better, fix the
>> implementation to send SIGTERM instead, making it possible for the user to
>> clean up all processes before exit.
>
> We already do, as described above.
>
>> I solved my particular problem by adding another flag,
>> unexpected_error_on_slave:
>>
>> volatile sig_atomic_t unexpected_error_occurred = 0;
>> int unexpected_error_on_slave = 0;
>> enum tag { work_tag, die_tag };
>>
>> void my_handler( int sig )
>> {
>>   unexpected_error_occurred = 1;
>> }
>>
>> //
>> // somewhere in the code...
>> //
>>
>> signal(SIGTERM, my_handler);
>>
>> if (world.rank() == root) { // root process
>>
>>   // do stuff
>>
>>   world.recv(mpi::any_source, die_tag, unexpected_error_on_slave);
>>   if ( unexpected_error_occurred || unexpected_error_on_slave ) {
>>
>>     // save something
>>
>>     world.abort(SIGABRT);
>>   }
>> }
>> else { // slave process
>>
>>   // do different stuff
>>
>>   if ( unexpected_error_occurred ) {
>>
>>     // just communicate the problem to the root
>>     world.send(root, die_tag, 1);
>>     signal(SIGTERM, SIG_DFL);
>>     while (true)
>>       ; // wait, the root will take care of this
>>   }
>>   world.send(root, die_tag, 0); // everything is fine
>> }
>>
>> signal(SIGTERM, SIG_DFL); // reassign default handler
>>
>> // the code continues...
>>
>> Note the slave must hang so the save operation gets executed at the root;
>> otherwise we are back to the previous scenario. It's theoretically
>> unnecessary to send MPI messages to accomplish the desired cleanup, and in
>> more complex applications this can turn into a nightmare. As we know,
>> asynchronous events are insane to debug.
>>
>> Best regards,
>> Júlio.
>>
>> P.S.: Open MPI 1.4.3 from the Ubuntu 11.10 repositories.
>>
>> 2012/3/23 Ralph Castain <r...@open-mpi.org>
>>
>> Well, yes and no. When a process abnormally terminates, OMPI will kill the
>> job - this is done by first hitting each process with a SIGTERM, followed
>> shortly thereafter by a SIGKILL.
>> So you do have a short time on each process to attempt to clean up.
>>
>> My guess is that your signal handler actually is getting called, but we then
>> kill the process before you can detect that it was called.
>>
>> You might try adjusting the time between the SIGTERM and the SIGKILL using
>> the odls_base_sigkill_timeout MCA param:
>>
>> mpirun -mca odls_base_sigkill_timeout N
>>
>> should cause it to wait for N seconds before issuing the SIGKILL. Not sure
>> if that will help or not - it used to work for me, but I haven't tried it
>> for awhile. What versions of OMPI are you using?
>>
>> On Mar 22, 2012, at 4:49 PM, Júlio Hoffimann wrote:
>>
>>> Dear all,
>>>
>>> I'm trying to handle signals inside an MPI task-farming model. The
>>> following is pseudo-code of what I'm trying to achieve:
>>>
>>> volatile sig_atomic_t unexpected_error_occurred = 0;
>>>
>>> void my_handler( int sig )
>>> {
>>>   unexpected_error_occurred = 1;
>>> }
>>>
>>> //
>>> // somewhere in the code...
>>> //
>>>
>>> signal(SIGTERM, my_handler);
>>>
>>> if (root process) {
>>>
>>>   // do stuff
>>>
>>>   if ( unexpected_error_occurred ) {
>>>
>>>     // save something
>>>
>>>     // re-raise SIGTERM, but now with the default handler
>>>     signal(SIGTERM, SIG_DFL);
>>>     raise(SIGTERM);
>>>   }
>>> }
>>> else { // slave process
>>>
>>>   // do different stuff
>>>
>>>   if ( unexpected_error_occurred ) {
>>>
>>>     // just propagate the signal to the root
>>>     signal(SIGTERM, SIG_DFL);
>>>     raise(SIGTERM);
>>>   }
>>> }
>>>
>>> signal(SIGTERM, SIG_DFL); // reassign default handler
>>>
>>> // the code continues...
>>>
>>> As can be seen, the signal handling is required for implementing a restart
>>> feature. The whole problem resides in my assumption that all processes in
>>> the communicator will receive a SIGTERM as a side effect. Is that a valid
>>> assumption? How does the actual MPI implementation deal with such
>>> scenarios?
>>>
>>> I also tried to replace all the raise() calls with MPI_Abort(), which,
>>> according to the documentation
>>> (http://www.open-mpi.org/doc/v1.5/man3/MPI_Abort.3.php), sends a SIGTERM to
>>> all associated processes. The undesired behaviour persists: when killing a
>>> slave process, the save section in the root branch is not executed.
>>>
>>> Appreciate any help,
>>> Júlio.
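
For reference, a minimal sketch in plain MPI C of the pattern Ralph describes:
each rank installs a SIGTERM handler that only sets a flag, the work loop polls
the flag, saves its restart data, and exits promptly, before the follow-up
SIGKILL arrives. This is illustrative code, not taken from the thread; the file
name, the save_checkpoint() helper, and the fake work loop are placeholders,
while the odls_base_sigkill_timeout MCA parameter is the one discussed above.

/* checkpoint_on_term.c - illustrative sketch only; assumes the
 * SIGTERM-then-SIGKILL teardown described above by Ralph. */
#include <mpi.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static volatile sig_atomic_t term_requested = 0;

/* async-signal-safe handler: only set a flag */
static void term_handler(int sig)
{
    (void)sig;
    term_requested = 1;
}

/* placeholder for whatever restart data this rank needs to save */
static void save_checkpoint(int rank)
{
    fprintf(stderr, "rank %d: checkpoint written, exiting\n", rank);
}

int main(int argc, char **argv)
{
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    signal(SIGTERM, term_handler);

    for (int step = 0; step < 1000; ++step) {
        if (term_requested) {
            /* mpirun is tearing the job down: save and exit promptly,
             * without further MPI calls, before the SIGKILL follows */
            save_checkpoint(rank);
            exit(EXIT_FAILURE);
        }
        usleep(100000); /* stand-in for one unit of real work */
    }

    MPI_Finalize();
    return EXIT_SUCCESS;
}

Launched as, for example,

mpirun -np 4 -mca odls_base_sigkill_timeout 5 ./checkpoint_on_term

each rank should, per Ralph's description, get roughly five seconds between the
SIGTERM and the SIGKILL in which save_checkpoint() can finish.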