Dear open-mpi users,
I am using open-mpi in conjunction with the mpi4py package to run
parallel simulations written in Python on my local machine.
I use the following idiom:
mpiexec -np 4 python myscript.py
When I hit ^C during the execution of the above command, the mpi program
is interrupted, and the Python programs are also interrupted.
However, I get no traceback from the Python programs and, more
problematically, their cleanup functions are not executed as they
should be when the programs get interrupted.
The open-mpi documentation states that: "When orterun (<=> mpiexec <=>
mpirun) receives a SIGTERM and SIGINT, it will attempt to kill the
entire job by sending all processes in the job a SIGTERM, waiting a
small number of seconds, then sending all processes in the job a SIGKILL."
Thus, the Python programs receive a SIGTERM signal instead of the SIGINT
signal that they would receive if I hit ^C during an execution
launched with the idiom:
python myscript.py
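The difference can be reproduced without MPI. Below is a minimal sketch,
assuming a POSIX system: CHILD is a hypothetical stand-in for myscript.py
whose cleanup sits in a finally block, and cleanup_ran is a helper name
made up for this illustration.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for myscript.py: the cleanup is a finally block.
CHILD = """
import time
try:
    print("ready", flush=True)
    time.sleep(5)
finally:
    print("cleanup", flush=True)
"""


def cleanup_ran(sig):
    """Start the child, send it `sig`, and report whether its cleanup ran."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # hide the KeyboardInterrupt traceback
        text=True,
    )
    proc.stdout.readline()  # wait until the child is inside its try block
    proc.send_signal(sig)
    remaining_output, _ = proc.communicate()
    return "cleanup" in remaining_output
```

With Python's default signal handling, I would expect
cleanup_ran(signal.SIGINT) to be true (SIGINT becomes a KeyboardInterrupt,
so the finally block runs) and cleanup_ran(signal.SIGTERM) to be false
(the interpreter is terminated without unwinding).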
I know that there is a way to make the Python programs handle the
SIGTERM signal as if it were a SIGINT signal (namely, by raising a
KeyboardInterrupt), but I would prefer to be able to configure mpiexec
to propagate the SIGINT signal it receives instead of sending a SIGTERM
signal to its child processes.
Would you know how this could be achieved?
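For reference, the Python-side workaround I have in mind looks roughly
like the sketch below. It only uses the standard signal module; the names
sigterm_as_keyboard_interrupt and run_with_cleanup are hypothetical, and
work and cleanup stand in for the simulation and its cleanup function.

```python
import signal


def sigterm_as_keyboard_interrupt(signum, frame):
    # Hypothetical handler name: make SIGTERM behave like ^C (SIGINT).
    raise KeyboardInterrupt


def run_with_cleanup(work, cleanup):
    # Hypothetical wrapper: `work` and `cleanup` stand in for the
    # simulation and the cleanup function of myscript.py.
    # signal.signal() must be called from the main thread.
    signal.signal(signal.SIGTERM, sigterm_as_keyboard_interrupt)
    try:
        work()
    finally:
        cleanup()  # now also runs when the process receives SIGTERM
```

Since, according to the documentation quoted above, a SIGKILL follows the
SIGTERM after a small number of seconds, the cleanup would have to finish
quickly for this to be reliable.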
Thank you very much for your time and help,
Nathan GREINER
PS: I am new to the open-mpi users mailing list: is this the right place
and way to ask such a question?