Hi,

On 23.04.2011 at 04:31, Pablo Lopez Rios wrote:

> I'm having a bit of a problem with wrapping mpirun in a script. The script 
> needs to run an MPI job in the background and tail -f the output. Pressing 
> Ctrl+C should stop tail -f, and the MPI job should continue. However mpirun 
> seems to detect the SIGINT that was meant for tail, and kills the job 
> immediately. I've tried workarounds involving nohup, disown, trap, subshells 
> (including calling the script from within itself), etc, to no avail.
> 
> The problem is that this doesn't happen if I run the command directly 
> instead, without mpirun. Attached is a script that reproduces the problem. It 
> runs a simple counting script in the background which takes 10 seconds to 
> run, and tails the output. If called with "nompi" as first argument, it will 
> simply run bash -c "$SCRIPT" >& "$out" &, and with "mpi" it will do the same 
> with 'mpirun -np 1' prepended. The output I get is:

What about:

( trap "" sigint; exec mpiexec ...) &

i.e. the subshell ignores SIGINT and is then replaced by mpiexec via exec, so 
mpiexec inherits the ignored signal. Well, maybe mpiexec resets the handling 
on its own again. This can be checked in the SigIgn and SigCgt fields of 
/proc/<pid>/status.
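
A minimal sketch of that check, assuming Linux and bash. Here sleep is only a 
stand-in for mpiexec (my assumption, to keep the example self-contained), and 
I use the portable signal name INT. If the same check on a real mpiexec 
process shows the SIGINT bit in SigCgt rather than SigIgn, then mpiexec has 
installed its own handler again.

```shell
#!/bin/bash
# Stand-in job: sleep takes the place of mpiexec (assumption for illustration).
( trap "" INT; exec sleep 5 ) &
pid=$!

# Wait (at most about 2 s) until exec has replaced the subshell with sleep.
for _ in $(seq 1 40); do
    [ "$(cat "/proc/$pid/comm" 2>/dev/null)" = "sleep" ] && break
    sleep 0.05
done

# SigIgn is a hex bitmask; signal n maps to bit n-1, so SIGINT (2) is mask 0x2.
sigign=$(awk '/^SigIgn:/ {print $2}' "/proc/$pid/status")
if (( (16#$sigign & 0x2) != 0 )); then
    ignored=yes
else
    ignored=no
fi
echo "SigIgn=$sigign, SIGINT ignored: $ignored"

# SIGTERM is not ignored, so the stand-in job can still be cleaned up.
kill "$pid"
wait "$pid" 2>/dev/null || true
```

The same awk line with SigCgt instead of SigIgn shows which signals the 
process catches with its own handler.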

-- Reuti

> 
> $ ./ompi_bug.sh mpi
> mpi:
> 1
> 2
> 3
> 4
> ^C
> $ ./ompi_bug.sh nompi
> nompi:
> 1
> 2
> 3
> 4
> ^C
> $ cat output.*
> mpi:
> 1
> 2
> 3
> 4
> mpirun: killing job...
> 
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 1222 on node pablomme exited on 
> signal 0 (Unknown signal 0).
> --------------------------------------------------------------------------
> mpirun: clean termination accomplished
> 
> nompi:
> 1
> 2
> 3
> 4
> 5
> 6
> 7
> 8
> 9
> 10
> Done
> 
> 
> This convinces me that there is something strange with OpenMPI, since I 
> expect no difference in signal handling when running a simple command with or 
> without mpirun in the middle.
> 
> I've tried looking for options to change this behaviour, but I don't seem to 
> find any. Is there one, preferably in the form of an environment variable? Or 
> is this a bug?
> 
> I'm using OpenMPI v1.4.3 as distributed with Ubuntu 11.04, and also v1.2.8 as 
> distributed with OpenSUSE 11.3.
> 
> Thanks,
> Pablo
> <ompi_bug.sh.gz>

